[llvm-dev] Safe fptoui/fptosi casts

Mon Nov 5 05:26:44 PST 2018

Hi everyone!

The fptoui/fptosi instructions are currently specified to return a poison
value if the rounded-towards-zero floating point number cannot be
represented by the target integer type. The motivation for this behavior is
that overflowing float to int casts in C are undefined behavior.

However, many newer languages prefer to have a float to integer cast that
is well-defined for all input values. A commonly chosen semantic is to
saturate towards the minimum and maximum values of the integer type, and
represent NaN values as zero. An extensive discussion of this issue for the
Rust language can be found at
https://github.com/rust-lang/rust/issues/10184.
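
As a rough illustration of the saturating semantic described above, here is a
minimal Rust sketch for f32 -> i32 and f32 -> u32. The function names are
hypothetical and chosen only for this example. The bounds are compared in the
floating point domain, because i32::MAX itself is not exactly representable in
f32.

```rust
/// Hypothetical helper: saturating f32 -> i32 conversion, NaN maps to 0.
fn fptosi_sat_f32_i32(x: f32) -> i32 {
    if x.is_nan() {
        0
    } else if x >= 2147483648.0_f32 {
        // 2^31 is exactly representable in f32; everything at or above it saturates.
        i32::MAX
    } else if x <= -2147483648.0_f32 {
        // -2^31 is exactly i32::MIN; everything at or below it saturates.
        i32::MIN
    } else {
        // In range: the cast truncates towards zero.
        x as i32
    }
}

/// Hypothetical helper: saturating f32 -> u32 conversion, NaN maps to 0.
fn fptoui_sat_f32_u32(x: f32) -> u32 {
    if x.is_nan() || x <= 0.0 {
        // NaN, negative values, and zero all become 0.
        0
    } else if x >= 4294967296.0_f32 {
        // 2^32 is exactly representable in f32.
        u32::MAX
    } else {
        x as u32
    }
}
```

Each call needs several comparisons on top of the conversion itself, which is
the cost that a target-aware lowering would aim to avoid.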

Unfortunately, implementing this behavior in an efficient manner is not
easy right now, because depending on the target architecture different
instruction sequences need to be generated. On ARM the vcvt instruction
directly exposes the desired saturation behavior. On X86 good instruction
sequences vary depending on the size of the floating point number, and the
size and signedness of the target integer type.

I think there are broadly three ways in which the current situation can be
improved:

1. Provide a fptoui/fptosi variant that produces target-specific values
instead of a poison value for unrepresentable values. The result would be
whatever is fastest for the given target.

2. Provide an intrinsic for saturating floating point to int conversions,
as described above.

3. Provide an intrinsic for floating point to int conversions, which
additionally indicates whether the value was representable, similarly to
the existing XXX.with.overflow family of intrinsics.

I think that point 1 is both the most pressing and the easiest to realize.
This would resolve the immediate soundness problem in Rust (if not in a
great way). Even if Rust specifies that float-to-int conversions are
saturating we'd still want to support this kind of operation for
performance reasons, and it would be preferable if performing a fast
float-to-int conversion did not require dropping into unsafe code.

The way I would imagine this to work is that fptoui/fptosi gain a flag
similar to add nsw/nuw -- let's call it "fptoui representable" for now. If
the flag is not specified the return value for unrepresentable values is
target-specific. If it is specified, the return value is poison.
(Alternatively the meaning of the flag could be inverted.)

From a cursory inspection of the code, there should not be too many places
that care about the presence of this flag. The main one is of course
constant folding, but there are probably others (I could imagine that the
Float2Int pass makes assumptions here, but I haven't looked too carefully).

Point 2 is also important, because specifying saturation as the default
behavior for float-to-int casts is becoming increasingly common. This would
need two new intrinsics, such as:

iYY llvm.fptoui.sat.fXX.iYY(fXX %a)
iYY llvm.fptosi.sat.fXX.iYY(fXX %a)

There is some precedent here with the recently introduced llvm.sadd.sat and
llvm.uadd.sat intrinsics for saturating integer addition. The wasm backend
also has custom llvm.wasm.trunc.saturate intrinsics for this purpose.

These intrinsics would also need corresponding SelectionDAG nodes. A
generic lowering would use a number of comparison (or min/max)
instructions, while target-specific lowerings will be able to do better
(e.g. a single instruction on ARM or wasm).
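
To give an idea of what a min/max-based generic lowering might compute, here
is a small Rust sketch for f64 -> i32. The function name is hypothetical. The
clamp is only valid here because both i32 bounds are exactly representable in
f64; for combinations such as f32 -> i32 the upper bound rounds to 2^31, so a
comparison-based sequence would be needed instead.

```rust
/// Hypothetical helper: saturating f64 -> i32 conversion via min/max, NaN maps to 0.
fn fptosi_sat_f64_i32_minmax(x: f64) -> i32 {
    if x.is_nan() {
        return 0;
    }
    // Both bounds convert to f64 without rounding, so clamping in the
    // floating point domain leaves a value that fits in i32 after truncation.
    let clamped = x.max(i32::MIN as f64).min(i32::MAX as f64);
    clamped as i32
}
```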

Point 3 is less important. Having a "with overflow" intrinsic would make it
easy to implement custom handling of unrepresentable values, e.g. to
generate an error in debug builds. The intrinsics would go something like
this:

{iYY, i1} llvm.fptoui.with.overflow.fXX.iYY(fXX %a)
{iYY, i1} llvm.fptosi.with.overflow.fXX.iYY(fXX %a)

If the overflow flag is true, the result could be specified to either be
target-specific or undef.
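
The intended behavior can be sketched in Rust roughly as follows. The function
name is hypothetical, and the value returned alongside a set overflow flag (0
here) is an arbitrary placeholder for this example, not a proposal for what the
intrinsic should return.

```rust
/// Hypothetical analogue of a "with overflow" f32 -> i32 conversion.
/// Returns (value, overflow); the value is a placeholder when overflow is true.
fn fptosi_with_overflow_f32_i32(x: f32) -> (i32, bool) {
    // Representable iff -2^31 <= x < 2^31. Both bounds are exact in f32.
    // Comparisons with NaN are false, so NaN is reported as overflow.
    let representable = x >= -2147483648.0_f32 && x < 2147483648.0_f32;
    if representable {
        (x as i32, false)
    } else {
        (0, true)
    }
}
```

A frontend could then branch on the flag, for instance to raise an error in
debug builds and fall back to some other value in release builds.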

---

I would like to have some feedback on whether there is interest in
improving this area, and in particular:

a) Whether introducing a flag to control poison vs target-specific value
for fptoui/fptosi is reasonable. Looking through the language reference, it
is somewhat unusual to have target-specific behavior for a fundamental
instruction.

b) Whether introducing first-class saturating float-to-int cast intrinsics
is reasonable.

Regards,
Nikita