[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long
Michael Clark via cfe-dev
cfe-dev at lists.llvm.org
Tue Apr 18 19:10:37 PDT 2017
> On 19 Apr 2017, at 1:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>
> On 18 April 2017 at 15:54, Michael Clark via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>> The only way towards completing a milestone is via fixing a number of small issues along
>> the way…
>
> I believe there's more to it than that. None of LLVM's optimizations
> are aware of this extra side-channel of information (with possible
> exceptions like avoiding speculating fdiv because of unavoidable
> exceptions).
>
> From what I remember, the real proposal is to replace all
> floating-point IR with intrinsics when FENV_ACCESS is on, which the
> optimizers by default won't have a clue about and will treat
> conservatively (essentially like they're modifying external memory).
>
> So be careful with drawing conclusions from small snippets; you're
> probably not seeing the full range of LLVM's behaviour.
Yes. I’m sure.
It reproduces with just the cast on its own: https://godbolt.org/g/myUoL2
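For anyone who wants to check this locally rather than on godbolt, here is a rough sketch of a test harness. The name fcvt_lu and its float parameter come from the mangled symbol in the IR; the body of the function, the helper cast_raises_inexact and the use of __attribute__((noinline)) (a GCC/Clang extension) are my assumptions, not the exact source behind the link.

```cpp
#include <cfenv>

// Assumed source for the function in the godbolt link: a bare cast.
// noinline mirrors the "noinline" attribute visible in the IR.
__attribute__((noinline)) unsigned long long fcvt_lu(float f) {
    return static_cast<unsigned long long>(f);
}

// Hypothetical helper: reports whether FE_INEXACT is raised while
// converting f. The volatile accesses keep the compiler from folding the
// conversion away or moving it across the fenv calls.
bool cast_raises_inexact(float f) {
    volatile float input = f;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile unsigned long long result = fcvt_lu(input);
    (void)result;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

Calling cast_raises_inexact(1.0f) is the interesting case: the conversion is exact, so a true result would match the behaviour reported in this thread. What it returns depends on the compiler and its lowering.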
It appears to be in the LLVM lowering of the fptoui instruction, so it must be the MC layer optimisations.
; Function Attrs: noinline nounwind uwtable
define i64 @_Z7fcvt_luf(float %f) #0 {
  %1 = alloca float, align 4
  store float %f, float* %1, align 4
  %2 = load float, float* %1, align 4
  %3 = fptoui float %2 to i64
  ret i64 %3
}
GCC performs a comparison with ucomiss and branches, whereas Clang computes both forms and predicates the result using a conditional move. One of the speculatively executed operations is evidently setting the INEXACT flag in MXCSR.
Clang lowering (inexact set when result is exact):
fcvt_lu(float):
        movss   xmm1, dword ptr [rip + .LCPI1_0] # xmm1 = mem[0],zero,zero,zero
        movaps  xmm2, xmm0
        subss   xmm2, xmm1
        cvttss2si rax, xmm2
        movabs  rcx, -9223372036854775808
        xor     rcx, rax
        cvttss2si rax, xmm0
        ucomiss xmm0, xmm1
        cmovae  rax, rcx
        ret
GCC lowering (sets flags correctly):
fcvt_lu(float):
        ucomiss xmm0, DWORD PTR .LC0[rip]
        jnb     .L4
        cvttss2si rax, xmm0
        ret
.L4:
        subss   xmm0, DWORD PTR .LC0[rip]
        movabs  rdx, -9223372036854775808
        cvttss2si rax, xmm0
        xor     rax, rdx
        ret
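
One difference between the two listings that may be worth checking: in the Clang sequence subss runs for every input, while in the GCC sequence it only runs on the .L4 path. The sketch below isolates that subtraction. It assumes the constant at .LCPI1_0 / .LC0 is 2^63 (the listings do not show its value), and the names kTwo63 and bias_subtraction_raises_inexact are mine.

```cpp
#include <cfenv>

// 2^63 as a float (exactly representable). Assumed value of the constant
// loaded from .LCPI1_0 / .LC0; the listings do not show it.
constexpr float kTwo63 = 9223372036854775808.0f;

// Hypothetical helper: reports whether computing f - 2^63 in single
// precision raises FE_INEXACT. The volatile accesses keep the subtraction
// between the two fenv calls.
bool bias_subtraction_raises_inexact(float f) {
    volatile float input = f;
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float shifted = input - kTwo63;  // the operation subss performs
    (void)shifted;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

For an input such as 1.0f the exact difference does not fit in a float's 24-bit significand, so the subtraction rounds and raises the flag even though the conversion of 1.0f itself is exact. For inputs at or above 2^63, where GCC takes the .L4 path, the difference is exactly representable.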