Try to clarify semantics of fptrunc

Thu Sep 3 11:26:01 PDT 2015

> How would that work for C/C++? I may be lacking creativity, but I
> think clang would need to generate a built-in "rounding mode"
> thread-local value,

I was under the impression that the rounding mode is already part of
thread-local state (at least in C11): it can be read with
fegetround() and changed with fesetround(). That said, C's definition
of rounding modes is poor (e.g. AFAIK it doesn't define how ties are
handled when rounding to nearest) and it doesn't include all of
IEEE 754's rounding modes.

> and load from it before any truncation?

How to actually handle this for codegen really depends on the kind of
instructions you emit. For example for X86

- If you use something like VCVTPS2PH the rounding mode is specified
by one of the operands (one of its values also allows the state of
MXCSR to be used). This is a good fit for a version of ``fptrunc``
with a rounding mode operand. Most people don't care about rounding
modes, but when you use half precision (IEEE 754 binary16) rounding
makes a huge difference. Although half-precision types are not
available in standard C or C++, they are available in languages like
OpenCL C.

- If you use something like CVTSD2SS to do the conversion the rounding
mode is not part of the operands and is instead part of the CPU's
state (the rounding-control field of the MXCSR register). Translating
a version of ``fptrunc`` with a rounding mode operand would be
problematic for codegen because the only way to always faithfully do
a conversion using the correct rounding mode would be to emit two
instructions: the first changing the state of the MXCSR register to
set the required rounding mode, the second doing the actual
conversion. This would be very sub-optimal code because most of the
time people don't change rounding modes. It might be possible to
eliminate the first instruction when it can be statically proven that
the rounding mode has not changed since a previous instruction set
it, but in general I think this would be difficult to tackle.

AFAIK LLVM IR doesn't really have a concept of a floating-point
rounding mode, which seems problematic to me if you really care about
numerical accuracy.

Perhaps a good first step would be to add a rounding mode
**immediate** operand (branching for different rounding modes in
codegen does not seem desirable) to ``fptrunc`` that can be one of the
following values

- undef
- RU (Round up, i.e. to +infinity)
- RD (Round down, i.e. to -infinity)
- RZ (Round to zero, note this is the same as truncation)
- RNE (Round to nearest, ties to even)
- RNA (Round to nearest, ties to away from zero)

These are all the rounding modes defined by IEEE 754, plus one extra.
The "undef" rounding mode means that the target architecture can use
any rounding mode it likes (even one that doesn't conform to an
IEEE 754 rounding mode!). For C/C++, Clang could set the extra
``fptrunc`` operand to "undef", which basically means the current LLVM
backends can emit whatever they emit now (on X86 this probably means
using the rounding mode specified by the MXCSR register). Other
language front-ends (e.g. OpenCL C), however, could specify a
different rounding mode when emitting ``fptrunc``, and then the LLVM
backends must either emit instructions that use that rounding mode or
report an error if that rounding mode cannot be implemented by that
architecture (or by the software floating point implementation used
by the target if it uses one).
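
As a rough model of how a backend might treat such an operand, here is
a small C sketch that maps the proposed values onto the C11 <fenv.h>
modes. All the names (fptrunc_rm, rm_status, rm_to_fenv) are made up
for illustration. It assumes the target defines all four standard
FE_* macros and that its FE_TONEAREST breaks ties to even, which is
the usual case on IEEE 754 hardware but is not something C11 itself
promises.

```c
#include <assert.h>
#include <fenv.h>
#include <stddef.h>

/* Hypothetical model of the proposed rounding-mode immediate. */
enum fptrunc_rm { RM_UNDEF, RM_RU, RM_RD, RM_RZ, RM_RNE, RM_RNA };

enum rm_status {
    RM_OK,            /* *fenv_mode holds the mode to install */
    RM_KEEP_CURRENT,  /* "undef": use whatever mode is already active */
    RM_UNSUPPORTED    /* no way to express this mode on this target */
};

/* Map a proposed rounding mode onto a C11 <fenv.h> mode. */
enum rm_status rm_to_fenv(enum fptrunc_rm rm, int *fenv_mode) {
    assert(fenv_mode != NULL);
    switch (rm) {
    case RM_UNDEF: return RM_KEEP_CURRENT;
    case RM_RU:  *fenv_mode = FE_UPWARD;     return RM_OK;
    case RM_RD:  *fenv_mode = FE_DOWNWARD;   return RM_OK;
    case RM_RZ:  *fenv_mode = FE_TOWARDZERO; return RM_OK;
    case RM_RNE: *fenv_mode = FE_TONEAREST;  return RM_OK; /* assumed ties-to-even */
    case RM_RNA: return RM_UNSUPPORTED;  /* C11 has no macro for ties-away */
    }
    return RM_UNSUPPORTED;
}
```

The RM_RNA case is the "report an error" path: C11 offers no way to
ask for ties-away, so this model has to refuse it, while RM_UNDEF
costs nothing because no mode change is needed.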

I'm not really sure how interested people would be in a change like
this. It would be quite invasive and probably not that well tested,
because the main source of LLVM IR right now is Clang and C/C++
doesn't expose rounding modes very nicely.

Thanks,
Dan.