[LLVMdev] [PATCH][RFC]: Add fmin/fmax intrinsics

Dan Gohman dan433584 at gmail.com
Fri Sep 12 17:39:49 PDT 2014


On Fri, Sep 12, 2014 at 3:04 PM, Owen Anderson <resistor at mac.com> wrote:

>
> On Sep 12, 2014, at 2:24 PM, Owen Anderson <resistor at mac.com> wrote:
>
>
> On Sep 12, 2014, at 10:27 AM, Dan Gohman <dan433584 at gmail.com> wrote:
>
>
>> More generally, I don’t see a compelling reason for LLVM to add intrinsic
>> support for the version you’re proposing.  Your choice can easily be
>> expanded into IR, and does not have the wide hardware support (particularly
>> in GPUs) that the IEEE version does.
>>
>
> The IEEE version can also be expanded in LLVM IR. And for GPUs, many GPU
> input languages leave the behavior on NaN unspecified, so it's not
> obviously the best guide.
>
>
> That’s not generally true.  HLSL (DirectX), CUDA, OpenCL, and Metal all
> have defined semantics for NaNs which include not propagating them through
> min/max.  GLSL (OpenGL) is the odd one out in this area.
>
>
HLSL leaves it undefined:

http://msdn.microsoft.com/en-us/library/windows/desktop/bb509624%28v=vs.85%29.aspx

I guess Metal and others only have a "fast-math" flag which (among other
things) makes behavior on NaN undefined, but it's my impression that it's a
popular flag.


> Also, as a practical issue, many GPUs have ISA-level support for the
> IEEE-conforming version.  Some (all?) of the AMD GPUs that Matt cares about
> support it, and PTX has native operations for it as well.  The IR expansion
> of an IEEE-conforming fmin/fmax is at least three compares + selects, which
> makes it very difficult to pattern match for these targets.
>

It's 2 compares + selects:

float nan_swallowing_fmin(float a, float b) {
  return b != b ? a : (a < b ? a : b);
}

which is within the realm of pattern-matching.
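
As a quick sanity check of the claim above, here is a minimal sketch that
exercises the two-compare form on the NaN cases. The function is repeated so
the block compiles on its own; the helper check_nan_swallowing_fmin is a
hypothetical name used only for illustration and is not part of the patch.

```c
#include <assert.h>
#include <math.h>

/* Same two-compare form as in the message, repeated for self-containment. */
float nan_swallowing_fmin(float a, float b) {
  /* Compare 1: b != b is true only when b is NaN; then return a.
     Compare 2: a < b is false when a is NaN, so b is returned instead. */
  return b != b ? a : (a < b ? a : b);
}

/* Hypothetical self-check (illustration only). */
void check_nan_swallowing_fmin(void) {
  assert(nan_swallowing_fmin(1.0f, 2.0f) == 1.0f);  /* ordinary case */
  assert(nan_swallowing_fmin(2.0f, 1.0f) == 1.0f);  /* ordinary case, swapped */
  assert(nan_swallowing_fmin(NAN, 2.0f) == 2.0f);   /* NaN in a is dropped */
  assert(nan_swallowing_fmin(1.0f, NAN) == 1.0f);   /* NaN in b is dropped */
  assert(isnan(nan_swallowing_fmin(NAN, NAN)));     /* both NaN: NaN */
}
```

A NaN is returned only when both operands are NaN. The sketch does not try
to order -0.0 against +0.0.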


>
> The inverse form (always propagating NaNs) is not widely natively
> supported.  I think AArch64 *might* have it?
>

It does. In fact, even armv7 has a NaN-propagating min/max:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/CIHDEEBE.html
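
For comparison, the NaN-propagating form can also be written with two
compares. This is only a sketch of the semantics under discussion: the name
nan_propagating_fmin is hypothetical, and the code is not claimed to match
the ARM instructions in every corner case (for example, signed zeros).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical name; sketch of "always propagate NaN" min semantics. */
float nan_propagating_fmin(float a, float b) {
  /* If a is NaN, return a (NaN). Otherwise, if a < b, return a.
     If b is NaN, a < b is false, so b (NaN) is returned. */
  return (a != a || a < b) ? a : b;
}

/* Hypothetical self-check (illustration only). */
void check_nan_propagating_fmin(void) {
  assert(nan_propagating_fmin(1.0f, 2.0f) == 1.0f);   /* ordinary case */
  assert(nan_propagating_fmin(2.0f, 1.0f) == 1.0f);   /* ordinary case, swapped */
  assert(isnan(nan_propagating_fmin(NAN, 2.0f)));     /* NaN in a propagates */
  assert(isnan(nan_propagating_fmin(1.0f, NAN)));     /* NaN in b propagates */
}
```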