[LLVMdev] [PATCH][RFC]: Add fmin/fmax intrinsics
Dan Gohman
dan433584 at gmail.com
Fri Sep 12 17:39:49 PDT 2014
On Fri, Sep 12, 2014 at 3:04 PM, Owen Anderson <resistor at mac.com> wrote:
>
> On Sep 12, 2014, at 2:24 PM, Owen Anderson <resistor at mac.com> wrote:
>
>
> On Sep 12, 2014, at 10:27 AM, Dan Gohman <dan433584 at gmail.com> wrote:
>
>
>> More generally, I don’t see a compelling reason for LLVM to add intrinsic
>> support for the version you’re proposing. Your choice can easily be
>> expanded into IR, and does not have the wide hardware support (particularly
>> in GPUs) that the IEEE version does.
>>
>
> The IEEE version can also be expanded in LLVM IR. And for GPUs, many GPU
> input languages leave the behavior on NaN unspecified, so it's not
> obviously the best guide.
>
>
> That’s not generally true. HLSL (DirectX), CUDA, OpenCL, and Metal all
> have defined semantics for NaNs which include not propagating them through
> min/max. GLSL (OpenGL) is the odd one out in this area.
>
>
HLSL leaves it undefined:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509624%28v=vs.85%29.aspx
I guess Metal and others only have a "fast-math" flag which (among other
things) makes behavior on NaN undefined, but it's my impression that it's a
popular flag.
> Also, as a practical issue, many GPUs have ISA-level support for the
> IEEE-conforming version. Some (all?) of the AMD GPUs that Matt cares about
> support it, and PTX has native operations for it as well. The IR expansion
> of an IEEE-conforming fmin/fmax is at least three compares + selects, which
> makes it very difficult to pattern match for these targets.
>
It's 2 compares + selects:
float nan_swallowing_fmin(float a, float b) {
    return b != b ? a : (a < b ? a : b);
}
which is within the realm of pattern-matching.
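For illustration, here is a small sketch that exercises the expression above on NaN inputs. The function body is the one from this message; the check_nan_swallowing_fmin helper is hypothetical and only there to show the expected results. It assumes the compiler is not run with fast-math-style flags, which could fold away the b != b test.

```c
#include <assert.h>
#include <math.h>

/* Same expression as in the message: two compares, two selects. */
float nan_swallowing_fmin(float a, float b) {
    /* b != b is true only when b is NaN; in that case return a. */
    /* If a is NaN and b is not, a < b is false, so b is returned. */
    return b != b ? a : (a < b ? a : b);
}

/* Hypothetical self-check, not part of the original proposal. */
void check_nan_swallowing_fmin(void) {
    float qnan = nanf("");
    assert(nan_swallowing_fmin(1.0f, 2.0f) == 1.0f);
    assert(nan_swallowing_fmin(2.0f, 1.0f) == 1.0f);
    /* A single NaN operand is swallowed in either position. */
    assert(nan_swallowing_fmin(qnan, 3.0f) == 3.0f);
    assert(nan_swallowing_fmin(3.0f, qnan) == 3.0f);
    /* NaN comes back only when both operands are NaN. */
    assert(isnan(nan_swallowing_fmin(qnan, qnan)));
}
```

The result is NaN only when both inputs are NaN, which is the behavior a backend would need to recognize when matching this pattern.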
>
> The inverse form (always propagating NaNs) is not widely natively
> supported.
>
> I think AArch64 *might* have it?
>
It does. In fact, even armv7 has a NaN-propagating min/max:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489i/CIHDEEBE.html
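For comparison, a NaN-propagating min can be sketched in the same two compares + selects shape. The name nan_propagating_fmin and the check helper are hypothetical, and this only illustrates the semantics (NaN in, NaN out); it is not a claim about the exact behavior of any particular instruction, for example with respect to NaN payloads or signed zeros.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical sketch: returns NaN if either operand is NaN. */
float nan_propagating_fmin(float a, float b) {
    /* a != a is true only when a is NaN; propagate it. */
    /* If b is NaN and a is not, a < b is false, so b (NaN) is returned. */
    return a != a ? a : (a < b ? a : b);
}

/* Hypothetical self-check showing the expected results. */
void check_nan_propagating_fmin(void) {
    float qnan = nanf("");
    assert(nan_propagating_fmin(1.0f, 2.0f) == 1.0f);
    assert(nan_propagating_fmin(2.0f, 1.0f) == 1.0f);
    /* A NaN in either position propagates to the result. */
    assert(isnan(nan_propagating_fmin(qnan, 3.0f)));
    assert(isnan(nan_propagating_fmin(3.0f, qnan)));
}
```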