[llvm-dev] Should rint and nearbyint be always constrained?

Serge Pavlov via llvm-dev llvm-dev at lists.llvm.org
Tue Mar 3 09:59:16 PST 2020


>
> One concern with replacing llvm.rint and llvm.nearbyint with
> llvm.roundeven is that it becomes difficult to turn the result back into a
> libcall if the backend doesn't have an instruction for it. You can't just
> call the roundeven library function since that wouldn't exist in older libm
> implementations. So ideally you would know which function was originally
> used in the user code and call that.


Yes, you are right. Such an optimization at the IR level probably does not
make sense.

Thanks,
--Serge


On Tue, Mar 3, 2020 at 11:41 PM Craig Topper <craig.topper at gmail.com> wrote:

> Note, EVEX static rounding forces suppress-all-exceptions (SAE). You can't
> have static rounding with exceptions.
>
> We're also talking about making the vector predicated floating point
> intrinsics that Simon Moll is working on support both strict and non-strict
> using operand bundles. So you're right we could probably merge constrained
> and non-constrained versions of the existing intrinsics.
>
> One concern with replacing llvm.rint and llvm.nearbyint with
> llvm.roundeven is that it becomes difficult to turn the result back into a
> libcall if the backend doesn't have an instruction for it. You can't just
> call the roundeven library function since that wouldn't exist in older libm
> implementations. So ideally you would know which function was originally
> used in the user code and call that.
>
> ~Craig
>
>
> On Tue, Mar 3, 2020 at 8:23 AM Serge Pavlov via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>> The only issue I see is that since we also assume FP operations have no
>>> side effects by default there is no difference between llvm.rint and
>>> llvm.nearbyint. I wouldn’t have a problem with dropping llvm.rint
>>> completely.
>>
>>
>> The forthcoming C standard (
>> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2454.pdf, 7.12.9.8)
>> defines a new function, `roundeven`, which implements the IEEE-754 operation
>> `roundToIntegralTiesToEven`. Once the corresponding intrinsic is
>> implemented (I am working on such a patch), llvm.rint and llvm.nearbyint will
>> be identical to llvm.roundeven in the default environment and both can be
>> dropped. We'll end up with a funny situation: there are constrained
>> intrinsics (experimental!) but no corresponding 'usual' intrinsics. This
>> demonstrates that splitting an operation into constrained and
>> non-constrained variants does not work in the case of `rint` and `nearbyint`.
>>
>>> As for the target-specific intrinsics, you are correct that we need a
>>> plan for that.
>>
>>
>> When making such a plan we should keep in mind that some targets encode
>> the rounding mode in instructions rather than in a hardware register. In
>> this case the "floating point environment" is an attribute of a particular
>> instruction. By the way, the X86 target also has this property: the EVEX
>> prefix allows static rounding or suppress-all-exceptions. Such properties
>> are naturally modeled with metadata operands, but splitting into constrained
>> and non-constrained variants makes little sense.
>>
>>> My suggestion would be that we should set the strictfp attribute on these
>>> intrinsics and provide the rounding mode and exception behavior arguments
>>> using an operand bundle.
>>
>>
>> This is an interesting variant. IIUC it means that the FP environment is a
>> property of a call rather than of an instruction? That is, one call may have
>> a rounding mode argument and another call of the same intrinsic may not?
>> It would be the third way to express the FP environment, together with the
>> current per-intrinsic way and the rejected per-basic-block one. I wonder if
>> we can model “inaccessibleMemOnly” or something like that this way. The main
>> justification for splitting an intrinsic into constrained and non-constrained
>> variants is that one has side effects and the other does not. If we could
>> deliberately assign this property to a particular call, we could eventually
>> merge constrained and non-constrained intrinsics.
>>
>>> It’s probably best to say in the documentation that the llvm.nearbyint
>>> and llvm.rint functions “assume the default rounding mode, roundToNearest”.
>>> This will allow the optimizer to transform them as if they were rounding to
>>> nearest without requiring backends to use an encoding that enforces
>>> roundToNearest as the rounding mode for these operations.
>>
>>
>> The optimizer could make the same optimization with constrained nearbyint
>> and rint, replacing them with llvm.roundeven, if it knows that the
>> environment is default.
>>
>>> Also, we should take care to document the non-constrained forms of these
>>> intrinsics in a way that makes clear that we are “assuming” and not
>>> requiring that the operation has no side effects.
>>
>>
>> What can the non-constrained forms of rint/nearbyint be used for? They do
>> the same job as llvm.roundeven does, so they are useless. These intrinsics
>> were introduced to represent the C library functions rint/nearbyint, but the
>> standard explicitly states that the result of either depends on the current
>> rounding mode. So these intrinsics should not be split into constrained and
>> non-constrained forms; only the form that is ordered relative to other
>> operations accessing the FP environment should exist.
>>
>>> Here are some suggested wordings for the “Semantics” section of the
>>> langref for these functions:
>>
>>
>> Thank you!
>>
>>> I’d like to also say that these intrinsics can be lowered to the
>>> corresponding libm functions, but I’m not sure all libm implementations
>>> meet the requirements above.
>>
>>
>> I think we should reference the C standard rather than a particular library.
>> For example, the semantics of roundeven:
>>
>> This function implements the IEEE-754 operation
>> ``roundToIntegralTiesToEven``. It also behaves in the same way as the C
>> standard function ``roundeven``, except that it does not raise floating
>> point exceptions.
>>
>>
>> Thanks,
>> --Serge
>>
>>
>> On Tue, Mar 3, 2020 at 7:32 PM Hanna Kruppe <hanna.kruppe at gmail.com>
>> wrote:
>>
>>> Hi Andy,
>>>
>>> On Mon, 2 Mar 2020 at 23:59, Kaylor, Andrew via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Some clarification after getting feedback from Craig Topper….
>>>>
>>>>
>>>>
>>>> It’s probably best to say in the documentation that the llvm.nearbyint
>>>> and llvm.rint functions “assume the default rounding mode, roundToNearest”.
>>>> This will allow the optimizer to transform them as if they were rounding to
>>>> nearest without requiring backends to use an encoding that enforces
>>>> roundToNearest as the rounding mode for these operations. On modern x86
>>>> targets we can encode it either way, but it seems more consistent to
>>>> continue using the current encoding which tells the processor to use the
>>>> current rounding mode. For other targets (including cases where x86 is
>>>> forced to use x87 instructions), it may be much easier to leave this at the
>>>> discretion of the backend.
>>>>
>>>>
>>>>
>>>> Also, we should take care to document the non-constrained forms of
>>>> these intrinsics in a way that makes clear that we are “assuming” and not
>>>> requiring that the operation has no side effects.
>>>>
>>>
>>> Note that these aspects are shared by most other FP operations and
>>> already discussed in the LangRef section <
>>> https://llvm.org/docs/LangRef.html#floating-point-environment> which
>>> currently reads:
>>>
>>> > The default LLVM floating-point environment assumes that
>>> floating-point instructions do not have side effects. Results assume the
>>> round-to-nearest rounding mode. No floating-point exception state is
>>> maintained in this environment. Therefore, there is no attempt to create or
>>> preserve invalid operation (SNaN) or division-by-zero exceptions.
>>> >
>>> >  The benefit of this exception-free assumption is that floating-point
>>> operations may be speculated freely without any other fast-math relaxations
>>> to the floating-point model.
>>> >
>>> > Code that requires different behavior than this should use the
>>> Constrained Floating-Point Intrinsics.
>>>
>>> Your explanation of the implications for optimizers and backends seems
>>> like a useful addition to this section. As many intrinsics (not just
>>> nearbyint/rint) and instructions (fadd, fmul, etc.) behave this way, I
>>> think it would be more useful to consolidate all the information into this
>>> section and reference it from the relevant "Semantics" sections.
>>>
>>> While we're on it, let me point out the consequences of breaking these
>>> assumptions are still fuzzy even with your clarifications. In general, when
>>> a compiler "assumes" something that is not actually true, it's useful to
>>> specify what exactly happens when the assumption is actually false, e.g.
>>> the result is an undefined value (undef/poison), or a non-deterministic
>>> choice is made (e.g. branching on poison, at the moment), or Undefined
>>> Behavior happens. In this sense, I wonder what should happen when the
>>> assumptions about rounding mode and FP exception state are broken? If it's
>>> going to take broader discussion to agree on an answer, that's probably out
>>> of scope for this thread, but perhaps there's a clear answer that just
>>> wasn't written down so far?
>>>
>>>> For the constrained version of nearbyint, we will require that the
>>>> inexact exception is not raised (to be consistent with IEEE 754-2019’s
>>>> roundToIntegral operations) and for the constrained version of rint we will
>>>> require that the inexact exception is raised (to be consistent with IEEE
>>>> 754-2019’s roundToIntegralExact operation), but for the non-constrained
>>>> forms it should be clear that the backend is free to implement this in the
>>>> most efficient way possible, without regard to FP exception behavior.
>>>>
>>>>
>>>>
>>>> Finally, I see now the problem with documenting these in terms of the
>>>> IEEE operations, given that IEEE 754-2019 doesn’t describe an operation
>>>> that uses the current rounding mode without knowing what that is. I see
>>>> this as a problem of documentation rather than one that presents any
>>>> difficulty for the implementation.
>>>>
>>>
>>> I'm not quite sure what you mean by "uses the current rounding without
>>> knowing what it is" -- are you referring to the wobbly uncertainty caused by
>>> optimizations assuming one rounding mode but runtime code possibly using a
>>> different dynamic rounding mode? If so, explicitly defining what happens
>>> when dynamic and "assumed" rounding mode don't match (see above) also
>>> addresses this problem. Then the operations can be described like this:
>>>
>>> > If a rounding mode is assumed [RNE for non-constrained intrinsic or
>>> roundingMode argument != fpround.dynamic] and the current dynamic rounding
>>> mode differs from the assumed rounding mode, [pick one: behavior is
>>> undefined / result is poison / ...]. Otherwise, X operation is performed
>>> with the current dynamic rounding mode [which equals the statically assumed
>>> rounding mode if this clause applies].
>>>
>>> Best regards,
>>> Hanna
>>>
>>>
>>>> Here are some suggested wordings for the “Semantics” section of the
>>>> langref for these functions:
>>>>
>>>>
>>>>
>>>> llvm.nearbyint::semantics
>>>>
>>>>
>>>>
>>>> This function returns the same value as one of the IEEE 754-2019
>>>> roundToIntegral operations using the current rounding mode. The optimizer
>>>> may assume that the actual rounding mode is roundToNearest (IEEE 754:
>>>> roundTiesToEven), but backends may encode this operation either using that
>>>> rounding mode explicitly or using the dynamic rounding mode from the
>>>> floating point environment. The optimizer may assume that the operation has
>>>> no side effects and raises no FP exceptions, but backends may encode this
>>>> operation using either instructions that raise exceptions or instructions
>>>> that do not. The FP exceptions are assumed to be ignored.
>>>>
>>>>
>>>>
>>>> llvm.rint (delete, or identical semantics to llvm.nearbyint)
>>>>
>>>>
>>>>
>>>> llvm.experimental.constrained.nearbyint::semantics
>>>>
>>>>
>>>>
>>>> This function returns the same value as one of the IEEE 754-2019
>>>> roundToIntegral operations. If the roundingMode argument is
>>>> fpround.dynamic, the behavior corresponds to whichever of the
>>>> roundToIntegral operations matches the dynamic rounding mode when the
>>>> operation is executed. The optimizer may not assume any rounding mode in
>>>> this case, and backends must encode the operation in a way that uses the
>>>> dynamic rounding mode. Otherwise, the rounding mode may be assumed to be
>>>> that described by the roundingMode argument and backends may either use
>>>> instructions that encode that rounding mode explicitly or use the current
>>>> rounding mode from the FP environment.
>>>>
>>>>
>>>>
>>>> The optimizer may assume that this operation does not raise the inexact
>>>> exception when the return value differs from the input value, and if the
>>>> exceptionBehavior argument is not fpexcept.ignore, the backend must encode
>>>> this operation using instructions that guarantee that the inexact exception
>>>> is not raised. If the exceptionBehavior argument is fpexcept.ignore,
>>>> backends may encode this operation using either instructions that raise
>>>> exceptions or instructions that do not.
>>>>
>>>>
>>>>
>>>> llvm.experimental.constrained.rint::semantics
>>>>
>>>>
>>>>
>>>> This function returns the same value as the IEEE 754-2019
>>>> roundToIntegralExact operation. If the roundingMode argument is
>>>> fpround.dynamic, the behavior follows the dynamic rounding mode when the
>>>> operation is executed. The optimizer may not assume any rounding mode in
>>>> this case, and backends must encode the operation in a way that uses the
>>>> dynamic rounding mode. Otherwise, the rounding mode may be assumed to be
>>>> that described by the roundingMode argument and backends may either use
>>>> instructions that encode that rounding mode explicitly or use the current
>>>> rounding mode from the FP environment.
>>>>
>>>> If the exceptionBehavior argument is not fpexcept.ignore, the optimizer
>>>> must assume that this operation will raise the inexact exception when the
>>>> return value differs from the input value and the backend must encode this
>>>> operation using instructions that guarantee that the inexact exception is
>>>> raised in that case. If the exceptionBehavior argument is fpexcept.ignore,
>>>> backends may encode this operation using either instructions that raise
>>>> exceptions or instructions that do not.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I’d like to also say that these intrinsics can be lowered to the
>>>> corresponding libm functions, but I’m not sure all libm implementations
>>>> meet the requirements above.
>>>>
>>>>
>>>>
>>>> -Andy
>>>>
>>>>
>>>>
>>>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of *Kaylor,
>>>> Andrew via llvm-dev
>>>> *Sent:* Monday, March 02, 2020 9:56 AM
>>>> *To:* Serge Pavlov <sepavloff at gmail.com>; Ulrich Weigand <
>>>> Ulrich.Weigand at de.ibm.com>
>>>> *Cc:* LLVM Developers <llvm-dev at lists.llvm.org>
>>>> *Subject:* Re: [llvm-dev] Should rint and nearbyint be always
>>>> constrained?
>>>>
>>>>
>>>>
>>>> I agree with Ulrich. The default behavior of LLVM IR is to assume that
>>>> roundToNearest is the current rounding mode everywhere. This
>>>> corresponds to the C standard, which says that the user may only modify the
>>>> floating point environment if fenv access is enabled. In the latest version
>>>> of the C standard, pragmas are added which can change the rounding mode for
>>>> a region, and if these are implemented in clang the constrained versions of
>>>> all FP operations should be used. However, outside of regions where fenv
>>>> access is enabled either by pragma or command line option, we are free to
>>>> assume that the current rounding mode is the default rounding mode.
>>>>
>>>>
>>>>
>>>> So, llvm.rint and llvm.nearbyint (the non-constrained versions) can be
>>>> specifically documented as performing their operation according to
>>>> roundToNearest and clang can use them in the default case for the
>>>> corresponding libm functions, and llvm.experimental.constrained.rint and
>>>> llvm.experimental.constrained.nearbyint can be documented as using the
>>>> current rounding mode.
>>>>
>>>>
>>>>
>>>> The only issue I see is that since we also assume FP operations have no
>>>> side effects by default there is no difference between llvm.rint and
>>>> llvm.nearbyint. I wouldn’t have a problem with dropping llvm.rint
>>>> completely.
>>>>
>>>>
>>>>
>>>> As for the target-specific intrinsics, you are correct that we need a
>>>> plan for that. I have given it some thought, but nothing is currently
>>>> implemented. My suggestion would be that we should set the strictfp
>>>> attribute on these intrinsics and provide the rounding mode and exception
>>>> behavior arguments using an operand bundle. We do still need some way to
>>>> handle the side effects. My suggestion here is to add some new attribute
>>>> that means “no side effects” in the absence of the strictfp attribute and
>>>> something similar to “inaccessibleMemOnly” in the presence of strictfp.
>>>>
>>>>
>>>>
>>>> We could make the new attribute less restrictive than
>>>> inaccessibleMemOnly in that it only really needs to act as a barrier
>>>> relative to other things that are accessing the fp environment. I believe
>>>> Ulrich suggested this to me at the last LLVM Developer Meeting.
>>>>
>>>>
>>>>
>>>> -Andy
>>>>
>>>>
>>>>
>>>> *From:* Serge Pavlov <sepavloff at gmail.com>
>>>> *Sent:* Monday, March 02, 2020 8:10 AM
>>>> *To:* Ulrich Weigand <Ulrich.Weigand at de.ibm.com>
>>>> *Cc:* Kaylor, Andrew <andrew.kaylor at intel.com>; Cameron McInally <
>>>> cameron.mcinally at nyu.edu>; Kevin Neal <kevin.neal at sas.com>; LLVM
>>>> Developers <llvm-dev at lists.llvm.org>
>>>> *Subject:* Re: Should rint and nearbyint be always constrained?
>>>>
>>>>
>>>>
>>>> I'm not sure why this is an issue.  Yes, these two intrinsics depend
>>>> on the current rounding mode according to the C standard, and yes,
>>>> LLVM in default mode assumes that the current rounding mode is the
>>>> default rounding mode.  But the same holds true for many other
>>>> intrinsics and even the arithmetic IR operations like add.
>>>>
>>>>
>>>>
>>>> Any other intrinsic, like `floor`, `round`, etc., has meaning in the default
>>>> rounding mode. But use of `rint` or `nearbyint` in the default FP environment
>>>> is strange, since `roundeven` can be used instead. We could use the more
>>>> general intrinsics in all cases, as the special case of the default
>>>> environment is not of practical interest.
>>>>
>>>>
>>>>
>>>> There is another reason for special handling. The set of intrinsics
>>>> includes things like `x86_sse_cvtss2si`. It is unlikely that all of them
>>>> will eventually get a constrained counterpart. It looks more natural for such
>>>> intrinsics to be defined as accessing the FP environment and to be optimized
>>>> if the latter is default. These two intrinsics could be a good model for such
>>>> cases. IIUC, splitting entities into constrained and non-constrained is a
>>>> temporary solution; ideally they will merge into one entity. We could do it
>>>> for some intrinsics now.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> --Serge
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 2, 2020 at 8:58 PM Ulrich Weigand <
>>>> Ulrich.Weigand at de.ibm.com> wrote:
>>>>
>>>> Serge Pavlov <sepavloff at gmail.com> wrote on 02.03.2020 14:38:48:
>>>>
>>>> > This approach has issues when applied to the intrinsics `rint` and
>>>> > `nearbyint`. The value returned by either of these intrinsics depends on
>>>> > the current rounding mode. If they are considered as operations in the
>>>> > default environment, they would round only to nearest. That is far from
>>>> > the meaning of the standard C functions that these intrinsics represent.
>>>>
>>>> I'm not sure why this is an issue.  Yes, these two intrinsics depend
>>>> on the current rounding mode according to the C standard, and yes,
>>>> LLVM in default mode assumes that the current rounding mode is the
>>>> default rounding mode.  But the same holds true for many other
>>>> intrinsics and even the arithmetic IR operations like add.
>>>>
>>>> If you want to stop clang from making the default rounding mode
>>>> assumption, you need to use the -frounding-math option (or one
>>>> of its equivalents), which will cause clang to emit the corresponding
>>>> constrained intrinsics instead, for those two as well as all other
>>>> affected intrinsics.
>>>>
>>>> I don't see why it would make sense to add another special case
>>>> just for those two intrinsics ...
>>>>
>>>>
>>>> Bye,
>>>> Ulrich
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>