[cfe-dev] Inefficient code generation for _mm_test{z, c, nzc} (SSE4.1)
Craig Topper
craig.topper at gmail.com
Wed Apr 11 23:38:44 PDT 2012
Interestingly, the AVX ptest intrinsics are correctly taking 4 x i64
arguments. I'll fix the 128-bit versions to take 2 x i64.
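
For reference, the flag semantics described in the quoted message can be
modelled in portable C on a 2 x i64 view of the operands. This is only a
rough sketch: vec128_model and the model_* functions are hypothetical names
made up for illustration, not part of Clang, LLVM, or the intrinsics headers,
and the code does not emit PTEST itself.

```c
#include <assert.h>   /* only needed if you add checks like those in the note below */
#include <stdint.h>

/* Hypothetical stand-in for __m128i, viewed as two 64-bit lanes (2 x i64). */
typedef struct {
    uint64_t lane[2];
} vec128_model;

/* Models _mm_testz_si128(a, b): returns ZF, which PTEST sets when
   (a AND b) is all zeros across the full 128 bits. */
int model_testz(vec128_model a, vec128_model b) {
    return ((a.lane[0] & b.lane[0]) | (a.lane[1] & b.lane[1])) == 0;
}

/* Models _mm_testc_si128(a, b): returns CF, which PTEST sets when
   ((NOT a) AND b) is all zeros across the full 128 bits. */
int model_testc(vec128_model a, vec128_model b) {
    return ((~a.lane[0] & b.lane[0]) | (~a.lane[1] & b.lane[1])) == 0;
}

/* Models _mm_testnzc_si128(a, b): 1 only when both ZF and CF are clear,
   which is the condition a JNBE branch tests. */
int model_testnzc(vec128_model a, vec128_model b) {
    return !model_testz(a, b) && !model_testc(a, b);
}
```

With v = {0x00ff, 0} and m = {0x0ff0, 0}, both (v AND m) and ((NOT v) AND m)
are nonzero, so model_testz and model_testc return 0 and model_testnzc
returns 1.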
On Wed, Apr 11, 2012 at 2:41 AM, Florian Pflug <fgp at phlo.org> wrote:
> Hi
>
> I've stumbled over a deficiency in clang's codegen for the SSE4.1
> _mm_test* intrinsics. These intrinsics are supposed to map to the PTEST
> instruction, which sets the ZF (zero flag) and CF (carry flag) depending on
> whether the bitwise AND (or ANDNOT for CF) of two SSE registers is all zero
> or not. The construct
>
> if (_mm_test{z,c,nzc}_si128(v, m))
> …
>
> should thus produce a PTEST instruction followed by a branch instruction
> (JZ for _mm_testz_si128, JC for _mm_testc_si128, JNBE for
> _mm_testnzc_si128). Clang, however, instead produces something like
>
> PTEST …
> SETE %al
> MOVZBL %al, %eax
> TEST %eax, %eax
> JNE ...
>
> Also, the LLVM bitcode looks a tad strange. For
>
> if (_mm_testz_si128(v,v))
> body();
>
> Clang generates
>
> %2 = tail call i32 @llvm.x86.sse41.ptestz(<4 x float> %1, <4 x float> %1) nounwind
> %3 = icmp eq i32 %2, 0
> br i1 %3, label %5, label %4
> ; <label>:4 ; preds = %0
> tail call void (...)* @body() nounwind
> br label %5
> ; <label>:5 ; preds = %4, %0
> ret void
>
> Since _mm_testz_si128 uses __m128i (the integer SSE type), *not* __m128
> (the single-precision float SSE type), it seems strange that the
> corresponding LLVM intrinsic takes parameters of type float.
>
> I'm not sure whether fixing this involves changing Clang or LLVM (or
> both?), which is why I haven't filed a bug report so far, but instead
> posted this here.
>
> Funnily enough, GCC 4.2 (at least the OSX version) has the same problem.
> Later GCC versions get it right, though.
>
> best regards,
> Florian Pflug
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
--
~Craig