[llvm-dev] FENV_ACCESS and floating point LibFunc calls

Thu May 11 14:06:04 PDT 2017

Sounds like the select lowering issue is definitely separate from the FENV
work.

Is there a bug report with a C or IR example? You want to generate compare
and branch instead of a cmov for something like this?

int foo(float x) {
  if (x < 42.0f)
    return x;
  return 12;
}

define i32 @foo(float %x) {
  %cmp = fcmp olt float %x, 4.200000e+01
  %conv = fptosi float %x to i32
  %ret = select i1 %cmp, i32 %conv, i32 12
  ret i32 %ret
}

$ clang -O2 cmovfp.c -S -o -
    movss    LCPI0_0(%rip), %xmm1    ## xmm1 = mem[0],zero,zero,zero
    ucomiss    %xmm0, %xmm1
    cvttss2si    %xmm0, %ecx
    movl    $12, %eax
    cmoval    %ecx, %eax
    retq
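
For anyone following along, here is a rough C sketch of the two source-level shapes in question. The names foo_branch and foo_select are made up for illustration and do not come from any bug report, and an optimizing compiler is free to turn either shape into the other. foo_select mirrors the IR above, where the fptosi runs regardless of the compare result; that is the property that matters once FP status flags are observable.

```c
/* Branching shape: the float-to-int conversion only runs when x < 42. */
int foo_branch(float x) {
  if (x < 42.0f)
    return (int)x;
  return 12;
}

/* Select shape: convert first, then pick a result, as in the IR above.
 * Assumption for this sketch: x is finite and within the range of int,
 * because an out-of-range float-to-int conversion is undefined in C. */
int foo_select(float x) {
  int conv = (int)x;              /* executed unconditionally */
  return (x < 42.0f) ? conv : 12; /* candidate for a cmov */
}
```

Both functions return the same value for in-range inputs (1 for 1.5f, 12 for 100.0f). They differ only in whether the conversion executes on the path that returns 12.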

On Thu, May 11, 2017 at 1:28 PM, Kaylor, Andrew via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi Michael,
>
>
>
> To be honest I haven’t started working on FP to integer conversions for
> FENV_ACCESS yet.
>
>
>
> To some extent I would consider the issue that you found independent of
> what I’m doing to constrain FP behavior.  That is, I think we ought to make
> the change you’re asking for even apart from the FENV_ACCESS work.  When I
> get to the conversions for FENV_ACCESS support it may require some
> additional constraints, but I think if the branching conversion is usually
> faster (and it looks like it will be) then that should be the default
> behavior.
>
>
>
> I’ll try to look into that.  I’d offer to give you advice on putting
> together a patch, but I’m still learning my way around the ISel code
> myself.  I think I know enough to figure out what to do but not enough to
> tell someone else how to do it without a bunch of wrong turns.
>
>
>
> -Andy
>
>
>
>
>
> *From:* Michael Clark [mailto:michaeljclark at mac.com]
> *Sent:* Wednesday, May 10, 2017 7:59 PM
> *To:* Kaylor, Andrew <andrew.kaylor at intel.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* FENV_ACCESS and floating point LibFunc calls
>
>
>
> Hi Andy,
>
>
>
> I’m interested to try out your patches…
>
>
>
> I understand the scope of FENV_ACCESS is relatively wide; however, I’m
> still curious whether you managed to figure out how to prevent the
> SelectionDAGLegalize::ExpandNode() FP_TO_UINT lowering of the FPToUI
> intrinsic from producing the predicate logic that incorrectly sets the
> floating point accrued exceptions due to unconditional execution of the
> ISD::FSUB node attached to the SELECT node. It’s a little above my head to
> try to solve this issue with my current understanding of LLVM, but I could
> give it a try. I’d need some guidance as to how the lowering of SELECT can
> be controlled, i.e. where LLVM decides whether and how to lower a select
> node as a branch vs predicate logic.
>
>
>
> I’d almost forgotten that we microbenchmarked this and found the branching
> version is faster with regular input (< LLONG_MAX).
>
>
>
> - https://godbolt.org/g/ytgk7l
>
> All, where does LLVM decide to lower select as predicate logic vs branches
> and how does the cost model work? I’m curious about a tuning flag to
> generate branches instead of computing both values and using conditional
> moves…
>
>
>
> Best,
>
> Michael.
>
>
>
> On 11 May 2017, at 11:41 AM, via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
>
> Hi all,
>
> Background
> I've been working on adding the necessary support to LLVM for clang to be
> able to support the STDC FENV_ACCESS pragma, which basically allows users
> to modify rounding mode at runtime and depend on the value of
> floating-point status flags or to unmask floating point exceptions without
> unexpected side effects.  I've committed an initial patch (r293226) that
> adds constrained intrinsics for the basic FP operations, and I have a patch
> up for review now (https://reviews.llvm.org/D32319) that adds constrained
> versions of a number of libm-like FP intrinsics.
>
> Current problem
> Now I'm trying to make sure I have a good solution for the way in which
> the optimizer handles recognized calls to libm functions (sqrt, pow, cos,
> sin, etc.).  Basically, I need to prevent all passes from making any
> modifications to these calls that would make assumptions about rounding
> mode or improperly affect the FP status flags (either suppressing flags
> that should be present or setting flags that should not be set).  For
> instance, there are circumstances in which the optimizer will constant fold
> a call to one of these functions if the values of the arguments are known at
> compile time, but this constant folding generally assumes the default
> rounding mode and if the library call would have set a status flag, I need
> the flag to be set.
>
> Question
> My question is, can/should I just rely on the front end setting the
> "nobuiltin" attribute for the call site in any location where the FP
> behavior needs to be restricted?
>
> Ideally, I would like to be able to conditionally enable optimizations
> like constant folding if I am able to prove that the rounding mode, though
> dynamic, is known for the callsite at compile time (the constrained
> intrinsics have a mechanism to enable this), but at the moment I am more
> concerned about correctness and would be willing to sacrifice optimizations
> to get correct behavior.  Long term, I was thinking that maybe I could do
> something like attach metadata to indicate rounding mode and exception
> behavior when they were known.
>
> Is there a better way to do this?
>
> Thanks,
> Andy