[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work

Chris Lattner sabre at nondot.org
Wed Jul 22 22:54:15 PDT 2009


On Jul 21, 2009, at 11:14 PM, Eli Friedman wrote:
> Testcase (compile with clang >= r76726):
> #include <emmintrin.h>
> __m128i a(__m128 a, __m128 b) { return a==a & b==b; }
>
> CodeGen ends up scalarizing the comparison, which is really bad, and
> AFAIK different from what we did before vsetcc was removed.  The ideal
> code is a single cmpordps, although I don't think clang ever generated
> that for this construct.

Ok, we were missing this specific case because of some instcombine  
xforms that were only applying to scalars, not vectors.  I tweaked  
them to cover vectors and we're getting "perfect" code for this now  
(one cmpordps).
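
For anyone wondering why a single cmpordps is enough here: x == x is false
only when x is NaN, so (a==a & b==b) is the "ordered" predicate on a and b.
Here is a minimal per-lane sketch of that equivalence in scalar C. The helper
names are made up for illustration and are not anything in instcombine, and
it assumes strict IEEE semantics (no -ffast-math):

```c
#include <math.h>
#include <stdbool.h>

/* What the testcase computes per lane: both self-compares succeed. */
static bool self_compare_pair(float a, float b) {
    return (a == a) & (b == b);
}

/* What cmpordps computes per lane: neither operand is NaN. */
static bool ordered(float a, float b) {
    return !isunordered(a, b);
}
```

The two functions should agree for every pair of inputs, NaNs included.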

However, not all is sunshine and roses: there are some sad puppydog  
faces left.  Specifically, things like this still get scalarized:

#include <emmintrin.h>
__m128i a(__m128 a, __m128 b, __m128 c) { return a==b & c==b; }

The problem is that the IR going into Codegen has been (nicely)  
simplified to:

define <2 x i64> @a(<4 x float> %a, <4 x float> %b, <4 x float> %c) nounwind readnone {
entry:
	%cmp = fcmp oeq <4 x float> %a, %b		; <<4 x i1>> [#uses=1]
	%cmp4 = fcmp oeq <4 x float> %c, %b		; <<4 x i1>> [#uses=1]
	%and6 = and <4 x i1> %cmp, %cmp4		; <<4 x i1>> [#uses=1]
	%and = sext <4 x i1> %and6 to <4 x i32>		; <<4 x i32>> [#uses=1]
	%conv = bitcast <4 x i32> %and to <2 x i64>		; <<2 x i64>> [#uses=1]
	ret <2 x i64> %conv
}

When legalize types sees the sext from <4 x i1> -> <4 x i32>, its only  
solution right now is to scalarize the whole mess feeding into it,  
giving us really atrocious code.

IMO, the solution to this is to have a legalize-types action for  
vectors that corresponds to "promote" on scalars.  In this case, since  
X86 supports VSETCC, the <4 x i1> SETCC should "vector promote" to a  
VSETCC node with a <4 x i32> result, the and should vector promote to  
<4 x i32>, and the sext should vector promote to a vector sext_inreg.
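
To make the intended result concrete, here is a rough lane-by-lane model in
portable C of what the promoted sequence would compute.  This is only a
sketch: the v4i32 struct and the helper names are hypothetical stand-ins for
the DAG nodes, not actual legalizer code, and it assumes the promoted compare
yields all-ones/all-zeros per lane.

```c
#include <stdint.h>

#define LANES 4

/* Hypothetical model of a promoted <4 x i1>: each lane is held in an i32. */
typedef struct { int32_t lane[LANES]; } v4i32;

/* VSETCC-style ordered-equal compare: each lane becomes all-ones or zero. */
static v4i32 vsetcc_oeq(const float *x, const float *y) {
    v4i32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = (x[i] == y[i]) ? -1 : 0;
    return r;
}

/* The "and", promoted to operate on <4 x i32>. */
static v4i32 v_and(v4i32 x, v4i32 y) {
    v4i32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = x.lane[i] & y.lane[i];
    return r;
}

/* Vector sext_inreg from i1: replicate bit 0 across the whole lane. */
static v4i32 v_sext_inreg_i1(v4i32 x) {
    v4i32 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = -(x.lane[i] & 1);
    return r;
}

/* a==b & c==b, with every step kept in vector form. */
static v4i32 promoted_a(const float *a, const float *b, const float *c) {
    return v_sext_inreg_i1(v_and(vsetcc_oeq(a, b), vsetcc_oeq(c, b)));
}
```

Every step stays a whole-vector operation on <4 x i32>, which is the point:
nothing has to be pulled apart into scalars.  On a mask that is already
all-ones/all-zeros the sext_inreg is a no-op, so it could presumably be
folded away afterwards.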

I don't think that implementing this is particularly hard, but I have  
plenty of other things I'm working on right now.  Is anyone else  
interested in working on this?

-Chris


