[PATCH] D10804: [SDAG] Optimize unordered comparison in soft-float mode

Mon Jul 6 12:02:02 PDT 2015

Thanks for the update!  To me, the logic is sound.  Nitpick replies below.


================
Comment at: test/CodeGen/AArch64/arm64-fp128.ll:167
@@ -171,3 +166,3 @@
 
   ; olt == !uge, which LLVM unfortunately "optimizes" this to.
   %cond = fcmp olt fp128 %lhs, %rhs
----------------
anadolskiy wrote:
> ab wrote:
> > Can this comment be removed now?  I guess your change lets us undo some other transform (which you should probably look into by the way)
> This comment explains line 171: why we compare with GE, instead of LT (that should be in case of ordered LT)
Hmm, that's not how I understood the comment.

Checking both uno/__unordtf2 and oge/__getf2 is indeed an 'unfortunate "optimization"'.  We used to do something like:

IR:
    %1 = fcmp olt ...
    br i1 %1, label %iftrue, label %iffalse

SelectionDAG:
    %1 = fcmp olt ...
    %2 = xor i1 %1, 1
    brcond i1 %2, label %iffalse
    br label %iftrue

Optimized SelectionDAG:
    %1 = fcmp uge ...
    brcond i1 %2, label %iffalse
    br label %iftrue

It's this optimization that the comment complains about, I think, because a softened uge used to need two libcalls (as opposed to 1 for olt).


Whereas I read:
    bl __lttf2
    cmp w0, #0
    b.ge <iffalse>
  iftrue:

as a straightforward lowering for:

    %1 = fcmp olt ...
    %2 = xor i1 %1, 1
    brcond i1 %2, label %iffalse
    br label %iftrue

via (given that __lttf2 returns < 0 / -1 on "olt" only):

    %1 = call @__lttf2
    %2 = icmp lt i32 %1, 0
    %3 = xor i1 %2, 1
    brcond i1 %3, label %iffalse
    br label %iftrue

itself simplified to:

    %1 = call @__lttf2
    %2 = icmp ge i32 %1, 0
    brcond i1 %2, label %iffalse
    br label %iftrue


In any case, not a big deal, but I might have missed something, so: am I making sense?

================
Comment at: test/CodeGen/Thumb2/float-cmp.ll:84
@@ -83,4 +83,3 @@
 ; CHECK-LABEL: cmp_f_ugt:
-; NONE: bl __aeabi_fcmpgt
-; NONE: bl __aeabi_fcmpun
+; NONE: bl __aeabi_fcmple
 ; HARD: vcmpe.f32
----------------
anadolskiy wrote:
> ab wrote:
> > Hmm, these might benefit from stricter CHECK/NONE, since I guess the result gets inverted?
> Sorry, I didn't understand that. Can you please explain?
The IR does an UGT comparison, but this patch checks for OLE.  Since UGT == !OLE, I imagine there's some code right after the call that inverts r0 somehow.

I'm saying it might be a good idea to check (in this case, using "NONE:") that code as well.

================
Comment at: test/CodeGen/X86/fp-cmp-sf.ll:1
@@ +1,2 @@
+; RUN: llc < %s -march=x86 -mcpu=pentium -float-abi=soft | FileCheck %s 
+
----------------
anadolskiy wrote:
> ab wrote:
> > Can you chmod this file as non-executable?  Also, what about "fpcmp-soft-float" or "fpcmp-soft-fp"? "sf" isn't descriptive enough IMO.
> Why chmod this as non-executable?  Other tests don't do that.
I mean this test file (fp-cmp-sf.ll) is executable in your patch (see the svn property a few lines above on phabricator).  Before committing, please change your local fp-cmp-sf.ll file to not be executable:

    chmod -x test/CodeGen/X86/fp-cmp-sf.ll

You might also have to

    svn propdel svn:executable test/CodeGen/X86/fp-cmp-sf.ll

if you svn add-ed it already.


Repository:
  rL LLVM

http://reviews.llvm.org/D10804