[PATCH] D10804: [SDAG] Optimize unordered comparison in soft-float mode
Ahmed Bougacha
ahmed.bougacha at gmail.com
Mon Jul 6 12:02:02 PDT 2015
Thanks for the update! To me, the logic is sound. Nitpick replies below.
================
Comment at: test/CodeGen/AArch64/arm64-fp128.ll:167
@@ -171,3 +166,3 @@
; olt == !uge, which LLVM unfortunately "optimizes" this to.
%cond = fcmp olt fp128 %lhs, %rhs
----------------
anadolskiy wrote:
> ab wrote:
> > Can this comment be removed now? I guess your change lets us undo some other transform (which you should probably look into by the way)
> This comment explains line 171: why we compare with GE, instead of LT (that should be in case of ordered LT)
Hmm, that's not how I understood the comment.
Checking both uno/__unordtf2 and oge/__getf2 is indeed an 'unfortunate "optimization"'. We used to do something like:
IR:
%1 = fcmp olt ...
br i1 %1, label %iftrue, label %iffalse
SelectionDAG:
%1 = fcmp olt ...
%2 = xor i1 %1, 1
brcond i1 %2, label %iffalse
br label %iftrue
Optimized SelectionDAG:
%1 = fcmp uge ...
brcond i1 %2, label %iffalse
br label %iftrue
It's this optimization that the comment complains about, I think, because a softened uge used to need two libcalls (as opposed to 1 for olt).
Whereas I read:
bl __lttf2
cmp w0, #0
b.ge <iffalse>
iftrue:
as a straightforward lowering for:
%1 = fcmp olt ...
%2 = xor i1 %1, 1
brcond i1 %2, label %iffalse
br label %iftrue
via (given that __lttf2 returns < 0 / -1 on "olt" only):
%1 = call @__lttf2
%2 = icmp lt i32 %1, 0
%3 = xor i1 %2, 1
brcond i1 %3, label %iffalse
br label %iftrue
itself simplified to:
%1 = call @__lttf2
%2 = icmp ge i32 %1, 0
brcond i1 %2, label %iffalse
br label %iftrue
In any case, not a big deal, but I might have missed something, so: am I making sense?
================
Comment at: test/CodeGen/Thumb2/float-cmp.ll:84
@@ -83,4 +83,3 @@
; CHECK-LABEL: cmp_f_ugt:
-; NONE: bl __aeabi_fcmpgt
-; NONE: bl __aeabi_fcmpun
+; NONE: bl __aeabi_fcmple
; HARD: vcmpe.f32
----------------
anadolskiy wrote:
> ab wrote:
> > Hmm, these might benefit from stricter CHECK/NONE, since I guess the result gets inverted?
> Sorry, I didn't understand that. Can you please explain?
The IR does an UGT comparison, but this patch checks for OLE. Since UGT == !OLE, I imagine there's some code right after the call that inverts r0 somehow.
I'm saying it might be a good idea to check (in this case, using "NONE:") that code as well.
================
Comment at: test/CodeGen/X86/fp-cmp-sf.ll:1
@@ +1,2 @@
+; RUN: llc < %s -march=x86 -mcpu=pentium -float-abi=soft | FileCheck %s
+
----------------
anadolskiy wrote:
> ab wrote:
> > Can you chmod this file as non-executable? Also, what about "fpcmp-soft-float" or "fpcmp-soft-fp"? "sf" isn't descriptive enough IMO.
> Why chmod this as non-executable? Other tests don't do that.
I mean this test file (fp-cmp-sf.ll) is executable in your patch (see the svn property a few lines above on phabricator). Before committing, please change your local fp-cmp-sf.ll file to not be executable:
chmod -x test/CodeGen/X86/fp-cmp-sf.ll
You might also have to
svn propdel svn:executable test/CodeGen/X86/fp-cmp-sf.ll
if you svn add-ed it already.
Repository:
rL LLVM
http://reviews.llvm.org/D10804
More information about the llvm-commits
mailing list