<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Sep 12, 2014, at 2:24 PM, Owen Anderson <<a href="mailto:resistor@mac.com" class="">resistor@mac.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Sep 12, 2014, at 10:27 AM, Dan Gohman <<a href="mailto:dan433584@gmail.com" class="">dan433584@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><blockquote class="gmail_quote" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div style="word-wrap: break-word;" class=""><div class=""><br class="Apple-interchange-newline">More generally, I don’t see a compelling reason for LLVM to add intrinsic support for the version you’re proposing. 
Your choice can easily be expanded into IR, and does not have the wide hardware support (particularly in GPUs) that the IEEE version does.</div></div></blockquote><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">The IEEE version can also be expanded in LLVM IR. And for GPUs, many GPU input languages leave the behavior on NaN unspecified, so it's not obviously the best guide.</div></div></blockquote></div><br class=""><div class="">That’s not generally true. HLSL (DirectX), CUDA, OpenCL, and Metal all have defined semantics for NaNs which include not propagating them through min/max. GLSL (OpenGL) is the odd one out in this area.</div></div></div></blockquote><br class=""></div><div>Also, as a practical issue, many GPUs have ISA-level support for the IEEE-conforming version. Some (all?) of the AMD GPUs that Matt cares about support it, and PTX has native operations for it as well. The IR expansion of an IEEE-conforming fmin/fmax is at least three compares + selects, which makes it very difficult to pattern match for these targets.</div><div><br class=""></div><div>The inverse form (always propagating NaNs) is not widely natively supported. I think AArch64 *might* have it? MAXPS in SSE performs a ternary operator form that doesn’t match either definition.</div><div><br class=""></div><div>—Owen</div><br class=""></body></html>