[LLVMdev] Float compare-for-equality and select optimization opportunity

Marc B. Reynolds marc.reynolds at orange.fr
Tue May 27 05:49:31 PDT 2008


Nicolas:
The LLVM mail server is being slow, so I'm e-mailing my comment directly. The
generated code is IEEE-correct. Use CreateFCmpUEQ (unordered or equal), which
should generate your hand-written version. I'm assuming that your next e-mail
(min/max) is probably the same thing.
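
For reference, here is a minimal sketch in plain C of what the two predicates mean when one operand is NaN. The helper names fcmp_oeq and fcmp_ueq are made up for illustration (they are not LLVM APIs); they only mirror the semantics of "ordered and equal" versus "unordered or equal".

```c
#include <math.h>
#include <stdbool.h>

/* Ordered and equal: true only if neither operand is NaN and x == y.
   (Hypothetical helper mirroring the meaning of an OEQ compare.) */
static bool fcmp_oeq(float x, float y) {
    return !isnan(x) && !isnan(y) && x == y;
}

/* Unordered or equal: true if either operand is NaN, or if x == y.
   (Hypothetical helper mirroring the meaning of a UEQ compare.) */
static bool fcmp_ueq(float x, float y) {
    return isnan(x) || isnan(y) || x == y;
}
```

The two helpers agree on every pair of ordinary numbers and differ only when at least one operand is NaN, which is exactly the case the extra flag test in the generated code handles.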

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Nicolas Capens
Sent: Tuesday, May 27, 2008 11:10 AM
To: 'LLVM Developers Mailing List'
Subject: [LLVMdev] Float compare-for-equality and select optimization opportunity



Hi all,

 

I'm trying to generate code containing an ordered float compare for
equality, and select. The resulting code however has an unordered compare
and some Boolean logic that I think could be eliminated. In C syntax the
code looks like this:

 

float x, y;
int a, b, c;

if (x == y)   // Rotate the integers
{
    int t;

    t = a;
    a = b;
    b = c;
    c = t;
}

 

This is the resulting x86 assembly code:

 

movss       xmm0,dword ptr [ecx+4]
ucomiss     xmm0,dword ptr [ecx+8]
sete        al
setnp       dl
test        dl,al
mov         edx,edi
cmovne      edx,ecx
cmovne      ecx,esi
cmovne      esi,edi

 

While I'm pleasantly surprised that my branch does get turned into several
select operations as intended (cmov - conditional move - in x86), I'm
confused why it uses the ucomiss instruction (unordered compare and set
flags). I only used IRBuilder::CreateFCmpOEQ. It also appears to invert the
conditional, for no clear reason. I think it could be rewritten as follows:

 

movss       xmm0,dword ptr [ecx+4]
comiss      xmm0,dword ptr [ecx+8]
mov         edx,edi
cmove       edx,ecx
cmove       ecx,esi
cmove       esi,edi
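
To make the difference between the two sequences concrete, here is a rough C model of the condition each one tests. It assumes the documented flag results of comiss/ucomiss (both write ZF, PF and CF the same way, with all three set when either operand is NaN). The struct and function names are invented for this sketch.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the flag bits written by comiss/ucomiss. */
struct flags { bool zf, pf, cf; };

static struct flags compare_flags(float a, float b) {
    struct flags f = { false, false, false };  /* a > b: ZF=PF=CF=0 */
    if (isnan(a) || isnan(b)) {
        f.zf = f.pf = f.cf = true;             /* unordered: ZF=PF=CF=1 */
    } else if (a == b) {
        f.zf = true;                           /* equal: ZF=1 */
    } else if (a < b) {
        f.cf = true;                           /* less than: CF=1 */
    }
    return f;
}

/* Condition of a bare cmove after the compare: ZF set. */
static bool moves_with_cmove(struct flags f) {
    return f.zf;
}

/* Condition of the sete/setnp/test/cmovne sequence: ZF set and PF clear. */
static bool moves_with_sete_setnp(struct flags f) {
    return f.zf && !f.pf;
}
```

Under this model the bare cmove also fires when an operand is NaN, so it behaves like an "unordered or equal" test, while the sete/setnp pair filters out the NaN case and matches an ordered compare for equality.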

 

Compared to the original C syntax code this looks pretty straightforward.
Curiously, when I replace the compare-for-equality with something like a
less-than, it does generate such compact code (using comiss and cmova). And
the not-equal case looks like this:

 

movss       xmm0,dword ptr [ecx+4]
ucomiss     xmm0,dword ptr [ecx+8]
mov         esi,ecx
cmove       esi,edx
cmovne      ecx,eax
cmove       edx,eax

 

So this generates compact code but with an unordered compare.

 

Anyway, it looks like the compare-for-equality case in particular is missing
an optimization opportunity. It's no big deal to me but I thought someone
here might be interested.

 

Cheers,

 

Nicolas Capens
