[llvm-dev] llvm.rint specification

Fri Nov 16 17:55:54 PST 2018

There is no particular test that I am worried about.  I just think that lowering llvm.rint to VROUND with a 0x4 immediate (use the current MXCSR rounding mode) on x86 conflicts with llvm.rint’s round-to-nearest specification.  I guess it is OK as long as the default FPEnv rounding is only *assumed* to be round-to-nearest.  Thank you for the explanation.

Thanks,
Slava

From: Cameron McInally [mailto:cameron.mcinally at nyu.edu]
Sent: Wednesday, November 14, 2018 11:33 AM
To: Zakharin, Vyacheslav P <vyacheslav.p.zakharin at intel.com>
Cc: LLVM Developers Mailing List <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] llvm.rint specification

On Wed, Nov 14, 2018 at 1:52 PM Zakharin, Vyacheslav P <vyacheslav.p.zakharin at intel.com> wrote:
Hi Cameron,

Thank you for the comments, but I am even more confused now ☺

> llvm.rint won't honor the FPEnv in all cases. It falls under the default FPEnv
The default FPEnv section specifies round-to-nearest rounding mode.  In this case, how is it correct to map rint() user call to llvm.rint intrinsic call?

So the default FPEnv section *assumes* round-to-nearest. I think there's a subtle difference there. The intrinsics are optimized assuming that the FPEnv won't change. The user *can* change the FPEnv around intrinsic uses in this mode. Although, it's not guaranteed that optimizations won't change the intended behavior.

The constrained intrinsics offer (or I should say "will offer") more guarantees about optimizations and FPEnv safety.

That all said, I don't think there are any guarantees that an explicit libm call would honor the FPEnv either. The opaqueness of the libm call might give a modicum of security when interacting with the FPEnv, but I'm not aware of any hard guarantees.

Is there a particular llvm.rint test case that you're worried about? I'm under the impression that if a target expands this intrinsic, then it will fall back to the libm call. I also checked x86 and it produces VROUND with an 0x4 immediate, which should emulate libm's results (or pretty close to it). That is, of course, assuming that the optimizer didn't move the call around functions that could change the FPEnv.

If this is just a temporary inconsistency, and the long term solution is to map rint() user call to llvm.experimental.constrained.rint intrinsic call, I am OK with this.  If it is not, then I must be missing something.

I joined this project after the design was already laid down, so I don't want to speak for everyone. But IMO, yes, we would merge both intrinsics in the future. I do not see a clear path to optimizing the llvm.experimental.constrained.xxx intrinsics, without a tremendous amount of churn, under the current design.

I am sorry if this has been described somewhere in the RFC for the constrained FP intrinsics, which I haven’t read yet.

No apologies necessary. I didn't know about this until your email. ;)
