Try to clarify semantics of fptrunc

Thu Sep 3 13:30:37 PDT 2015

On Thu, Sep 3, 2015 at 11:26 AM, Dan Liew <dan at su-root.co.uk> wrote:
>> How would that work for C/C++? I may be lacking creativity, but I
>> think clang would need to generate a built-in "rounding mode"
>> thread-local value,
>
> I was under the impression that the rounding mode was already part of
> thread local storage (at least for C11) which is retrievable with
> fegetround() and modified using fesetround().
> Although C's definition of rounding modes is poor (e.g. doesn't define
> how ties are handled when doing rounding to nearest AFAIK) and doesn't
> include all of IEEE 754's rounding modes.

Yes it's thread local in the language, but the storage is an
implementation detail which LLVM currently doesn't model (it's in the
current CPU state). Modeling it gives LLVM more information... and
bloats the IR. Also, "pure" and "no read" functions that do strict FP
aren't pure anymore!

In a way it would need to be modeled as flags are, which is to say
individually by each backend, with some magical "i1" that they have to
disambiguate.

>> and load from it before any truncation?
>
> How to actually handle this for codegen really depends on the kind of
> instructions you emit. For example for X86
>
> - If you use something like VCVTPS2PH the rounding mode is specified
> by one of the operands (one of the values also allows the state of
> MXCSR to be used). This is
> a good fit for a version of ``fptrunc`` with a rounding mode operand.
> Most people don't care about rounding modes but when you use half
> precision (IEEE-754 binary16)
> rounding makes a huge difference. Although not available in C or C++
> they are available in languages like OpenCL C.
>
> - If you use something like CVTSD2SS to do the conversion the rounding
> mode is not part of the operand and is instead part of the CPU's state
> (set by MXCSR register).
> Translating a version of ``fptrunc`` with a rounding mode operand
> would be problematic for codegen because the only way to always faithfully
> do a conversion using the correct rounding mode would
> be to emit two instructions, the first changing the state of the
> MXCSR register to set the required rounding mode, the second doing the
> actual conversion. This would be very sub-optimal code because most of
> the time people don't change rounding modes. There might be ways of
> eliminating the need to emit the first instruction if it is statically
> possible to prove that the rounding mode has not been changed since
> being set by a previous instruction but in general I think this would
> be difficult to tackle.
>
> LLVM IR doesn't really have a concept of rounding mode for floating
> point AFAIK but to me this seems problematic if you really care about
> numerical accuracy.
>
> Perhaps a good first step would be to add a rounding mode
> **immediate** operand (branching for different rounding modes in
> codegen does not seem desirable) to ``fptrunc`` that can be one of the
> following values
>
> - undef
> - RU (Round up, i.e. to +infinity)
> - RD (Round down, i.e. to -infinity)
> - RZ (Round to zero, note this is the same as truncation)
> - RNE (Round to nearest, ties to even)
> - RNA (Round to nearest, ties to away from zero)
>
> These are all the rounding modes defined by IEEE754, plus one extra.
> The "undef" rounding mode means that the target architecture can use
> any rounding mode it likes (even one that doesn't conform to an IEEE754
> rounding mode!). For C/C++ Clang could set the extra ``fptrunc``
> operand to "undef" which basically means the current LLVM backends can
> emit whatever they emit now (on X86 probably uses rounding mode
> specified by MXCSR register). Other language front-ends however (e.g.
> OpenCL C) could specify a different rounding mode when emitting
> ``fptrunc`` and then the LLVM backends must emit instructions that
> will use that rounding mode or throw an error if that rounding mode
> cannot be implemented by that architecture (or by the software
> floating point implementation used by the target if it uses one).
>
> I'm not really sure how interested people would be in a change like
> this. It would be quite invasive and probably not that well tested
> because the main source of LLVM IR right now is Clang and C/C++
> doesn't expose rounding modes very nicely.

I'm not sure I understand your proposal... but I want to make sure
you're not suggesting that users' functions be duplicated :-)