[cfe-dev] FE_INEXACT being set for an exact conversion from float to unsigned long long

Michael Clark via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 18 19:10:37 PDT 2017


> On 19 Apr 2017, at 1:14 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> 
> On 18 April 2017 at 15:54, Michael Clark via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>> The only way towards completing a milestone is via fixing a number of small issues along
>> the way…
> 
> I believe there's more to it than that. None of LLVM's optimizations
> are aware of this extra side-channel of information (with possible
> exceptions like avoiding speculating fdiv because of unavoidable
> exceptions).
> 
> From what I remember, the real proposal is to replace all
> floating-point IR with intrinsics when FENV_ACCESS is on, which the
> optimizers by default won't have a clue about and will treat
> conservatively (essentially like they're modifying external memory).
> 
> So be careful with drawing conclusions from small snippets; you're
> probably not seeing the full range of LLVM's behaviour.


Yes. I’m sure.

It reproduces with just the cast on its own: https://godbolt.org/g/myUoL2
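
For readers who cannot open the link, here is a minimal sketch of how such a check might look. It is not the linked snippet; the helper name cast_raises_inexact and the use of volatile are my own assumptions. Neither compiler fully implemented FENV_ACCESS at the time, so the compiler is in principle free to move the conversion relative to the flag calls; treat this as an illustration rather than a guaranteed test.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdint>

// The conversion under discussion: a bare float -> unsigned 64-bit cast.
// Values outside [0, 2^64) are undefined behaviour for this cast, hence the assert.
std::uint64_t fcvt_lu(float f)
{
    assert(f >= 0.0f && f < 18446744073709551616.0f);  // 2^64, exact as a float
    return static_cast<std::uint64_t>(f);
}

// Hypothetical helper: clears the exception flags, performs the cast,
// and reports whether FE_INEXACT was raised in between.
bool cast_raises_inexact(float f)
{
    volatile float in = f;  // volatile discourages constant folding of the cast
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile std::uint64_t out = fcvt_lu(in);
    (void)out;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

Calling cast_raises_inexact(1.0f) should return false if the flags are handled correctly, since 1.0f converts exactly; the report in this thread is that the Clang x86-64 build returns true.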

It appears to be in the LLVM lowering of the fptoui instruction, so it must be in the MC layer optimisations.

; Function Attrs: noinline nounwind uwtable
define i64 @_Z7fcvt_luf(float %f) #0 {
  %1 = alloca float, align 4
  store float %f, float* %1, align 4
  %2 = load float, float* %1, align 4
  %3 = fptoui float %2 to i64
  ret i64 %3
}

GCC performs a comparison with ucomiss and branches, whereas Clang computes both forms and selects the result with a conditional move. Since both paths always execute in the Clang version, one of the operations on the path not taken is evidently setting the INEXACT flag in MXCSR.

Clang lowering (inexact set when result is exact):

fcvt_lu(float):
        movss   xmm1, dword ptr [rip + .LCPI1_0] # xmm1 = mem[0],zero,zero,zero
        movaps  xmm2, xmm0
        subss   xmm2, xmm1
        cvttss2si       rax, xmm2
        movabs  rcx, -9223372036854775808
        xor     rcx, rax
        cvttss2si       rax, xmm0
        ucomiss xmm0, xmm1
        cmovae  rax, rcx
        ret

GCC lowering (sets flags correctly):

fcvt_lu(float):
        ucomiss xmm0, DWORD PTR .LC0[rip]
        jnb     .L4
        cvttss2si       rax, xmm0
        ret
.L4:
        subss   xmm0, DWORD PTR .LC0[rip]
        movabs  rdx, -9223372036854775808
        cvttss2si       rax, xmm0
        xor     rax, rdx
        ret
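
The GCC sequence can be written out by hand as a sketch like the one below. The function name fcvt_lu_branchy and the constants are mine, not anything from either compiler. It only shows the shape of the branching approach: convert directly when the value fits in a signed 64-bit integer, otherwise subtract 2^63 (which is exact for inputs in [2^63, 2^64)), convert, and restore the top bit. An optimiser that if-converts the branch back into a select would reintroduce the problem, so this is not a reliable workaround.

```cpp
#include <cassert>
#include <cstdint>

// 2^63 and 2^64 are both exactly representable as float.
constexpr float kTwo63 = 9223372036854775808.0f;
constexpr float kTwo64 = 18446744073709551616.0f;

// Branching float -> uint64 conversion, modelled on the GCC listing.
// Only the path that applies to the input is meant to execute.
std::uint64_t fcvt_lu_branchy(float f)
{
    assert(f >= 0.0f && f < kTwo64);  // outside this range the cast is undefined
    if (f < kTwo63) {
        // Fits in a signed 64-bit integer: a single signed conversion suffices.
        return static_cast<std::uint64_t>(static_cast<std::int64_t>(f));
    }
    // f - 2^63 is exact here, so the subtraction itself raises no flag.
    const std::int64_t low = static_cast<std::int64_t>(f - kTwo63);
    // Put back the 2^63 that was subtracted by setting the top bit.
    return static_cast<std::uint64_t>(low) ^ (std::uint64_t{1} << 63);
}
```

For example, fcvt_lu_branchy(1.0f) takes the first path and never performs the subtraction, which would be inexact for an input that small.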