<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Nov 5, 2018, at 3:14 PM, Matthias Braun <<a href="mailto:mbraun@apple.com" class="">mbraun@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html; charset=us-ascii" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">That said,<div class=""><br class=""></div><div class="">1) I also think IR instructions (or builtins) must not change semantics based on the compilation target.</div><div class="">2) If anything, we could provide several builtins for the different target behaviors and let the frontend choose a performant one (assuming the frontend is aware of the target). We could even use TargetTransformInfo to let targets communicate back what is fastest.</div><div class="">3) The solution in 2) seems quite complex to me; we may just want to go for saturation behavior and let people take a small hit on X86. 
That's the price you pay if you eliminate undefined behavior...</div></div></div></blockquote>(Of course, this is meant in the sense of providing an additional builtin or instruction flag to get the saturation behavior; we should keep the default of producing UB for C/C++/etc.)</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div><div class="">- Matthias</div><div class=""><div class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On Nov 5, 2018, at 3:06 PM, Matthias Braun via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;" class=""><br class="Apple-interchange-newline"><br class=""><blockquote type="cite" class=""><div class="">On Nov 5, 2018, at 2:41 PM, Thomas Lively via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">I would be interested in learning which semantics for float-to-int conversion are actually in use. If the only two used are 1) undefined behavior if unrepresentable and 2) saturate to int_{min,max} with NaN going to zero, then I think it makes sense to expose both of those natively in the IR. If the set is much larger, I think separate intrinsics for each behavior would make sense. 
It would be nice to get rid of the wasm-specific intrinsic for behavior (2) and replace it with a target-independent intrinsic or IR, since this behavior is not actually particular to WebAssembly.</div></div></blockquote><div class=""><br class=""></div><div class="">For example:</div><div class=""><div class="">- X86 returns the "<span class="" style="font-family: Verdana;">indefinite integer value (80000000H)" (regardless of overflow in the positive or negative direction)</span></div><div class=""><span class="" style="font-family: Verdana;">- AArch64 saturates towards int_min/int_max</span></div><div class=""><span class="" style="font-family: Verdana;"><br class=""></span></div><div class=""><font face="Verdana" class="">So no matter which semantics you pick, code generation will require extra instructions on one of the two architectures...</font></div><div class=""><font face="Verdana" class=""><br class=""></font></div><div class=""><font face="Verdana" class="">- Matthias</font></div><div class=""><br class=""></div></div><blockquote type="cite" class=""><div class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Mon, Nov 5, 2018 at 2:37 PM Finkel, Hal J. 
via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex;"><div text="#000000" bgcolor="#FFFFFF" class=""><br class=""><div class="m_9159391481063226909moz-cite-prefix">On 11/05/2018 07:26 AM, Nikita Popov via llvm-dev wrote:<br class=""></div><blockquote type="cite" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class="">Hi everyone!</div><div class=""><br class=""></div><div class="">The fptoui/fptosi instructions are currently specified to return a poison value if the rounded-towards-zero floating point number cannot be represented by the target integer type. The motivation for this behavior is that overflowing float-to-int casts in C are undefined behavior.</div><div class=""><br class=""></div><div class="">However, many newer languages prefer to have a float-to-integer cast that is well-defined for all input values. A commonly chosen semantic is to saturate towards the minimum and maximum values of the integer type, and represent NaN values as zero.</div></div></div></blockquote><br class="">I think that this is fine, motivationally, and we might even want a dedicated intrinsic if the IR needed to represent the lowering will, later in the pipeline, be difficult to pattern match. However, if you want the casts to be well defined, then you should define their behavior. 
"Do some fast thing" is not really a definition, and I don't believe that we should give target-independent constructs target-dependent behavior.<br class=""><br class=""> -Hal<br class=""><br class=""><blockquote type="cite" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class="">An extensive discussion of this issue for the Rust language can be found at<span class="Apple-converted-space"> </span><a href="https://github.com/rust-lang/rust/issues/10184" target="_blank" class="">https://github.com/rust-lang/rust/issues/10184</a>.<br class=""></div><div class=""><br class=""></div><div class="">Unfortunately, implementing this behavior efficiently is not easy right now, because different instruction sequences need to be generated depending on the target architecture. On ARM, the vcvt instruction directly exposes the desired saturation behavior. On X86, good instruction sequences vary depending on the size of the floating point type, and on the size and signedness of the target integer type.</div><div class=""><br class=""></div><div class="">I think there are broadly three ways in which the current situation can be improved:<br class=""></div><div class=""><br class=""></div><div class="">1. Provide a fptoui/fptosi variant that produces target-specific values instead of a poison value for unrepresentable values. The result would be whatever is fastest for the given target.<br class=""></div><div class=""><br class=""></div><div class="">2. Provide an intrinsic for saturating floating point to int conversions, as described above.<br class=""></div><div class=""><br class=""></div><div class="">3. Provide an intrinsic for floating point to int conversions that additionally indicates whether the value was representable, similar to the existing XXX.with.overflow family of intrinsics.<br class=""></div><div class=""><br class=""></div><div class="">I think that point 1 is both the most pressing and the easiest to realize. 
This would resolve the immediate soundness problem in Rust (if not in a great way). Even if Rust specifies that float-to-int conversions are saturating we'd still want to support this kind of operation for performance reasons, and it would be preferable if performing a fast float-to-int conversion did not require dropping into unsafe code.</div><div class=""><br class=""></div><div class="">The way I would imagine this to work is that fptoui/fptosi gain a flag similar to add nsw/nuw -- let's call it "fptoui representable" for now. If the flag is not specified the return value for unrepresentable values is target-specific. If it is specified, the return value is poison. (Alternatively the meaning of the flag could be inverted.)</div><div class=""><br class=""></div><div class="">From a cursory inspection of the code, there should not be too many places that care about the presence of this flag. The main one is of course constant folding, but there are probably others (I could imagine that the Float2Int pass makes assumptions here, but haven't looked too carefully.)</div><div class=""><br class=""></div><div class="">Point 2 is also important, because specifying saturation as the default behavior for float-to-int casts is becoming increasingly common. This would need two new intrinsics, such as:<br class=""></div><div class=""><br class=""></div><div class="">iYY llvm.fptoui.sat.fXX.iYY(fXX %a)<div class="">iYY llvm.fptosi.sat.fXX.iYY(fXX %a)</div><div class=""><br class=""></div><div class="">There is some precedent here with the recently introduced llvm.sadd.sat and llvm.uadd.sat intrinsics for saturating integer addition. The wasm backend also has custom llvm.wasm.trunc.saturate intrinsics for this purpose.</div><div class=""><br class=""></div><div class="">These intrinsics would also need corresponding SelectionDAG nodes. A generic lowering would use a number of comparison (or min/max) instructions, while target-specific lowerings will be able to do better (e.g. 
a single instruction on ARM or wasm).<br class=""></div><div class=""><br class=""></div><div class="">Point 3 is less important. Having a "with overflow" intrinsic would make it easy to implement custom handling of unrepresentable values, e.g. to generate an error in debug builds. The intrinsics would go something like this:<br class=""></div><div class=""><br class=""></div><div class="">{iYY, i1} llvm.fptoui.with.overflow.fXX.iYY(fXX %a)</div><div class="">{iYY, i1} llvm.fptosi.with.overflow.fXX.iYY(fXX %a)<span class="Apple-converted-space"> </span><br class=""></div><div class=""><br class=""></div><div class="">If the overflow flag is true, the result could be specified to be either target-specific or undef.<br class=""></div><div class=""><br class=""></div><div class="">---</div><div class=""><br class=""></div><div class="">I would like to have some feedback on whether there is interest in improving this area, and in particular:</div><div class=""><br class=""></div><div class="">a) Whether introducing a flag to control poison vs. target-specific value for fptoui/fptosi is reasonable. Looking through the language reference, it is somewhat unusual to have target-specific behavior for a fundamental instruction.</div><div class=""><br class=""></div><div class="">b) Whether introducing first-class saturating float-to-int cast intrinsics is reasonable.</div><div class=""><br class=""></div><div class="">Regards,<br class=""></div><div class="">Nikita<br class=""></div></div></div></div><br class=""><fieldset class="m_9159391481063226909mimeAttachmentHeader"></fieldset><br class=""><pre class="">_______________________________________________
LLVM Developers mailing list
<a class="m_9159391481063226909moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>
<a class="m_9159391481063226909moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a>
</pre></blockquote><br class=""><pre class="m_9159391481063226909moz-signature" cols="72">-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre></div></blockquote></div></div></blockquote></div></div></blockquote></div><br class=""></div></div></div></div></blockquote></div><br class=""></body></html>