Patch to fix vselect compilation on KNL.

Wed Dec 4 06:06:27 PST 2013

Hey Aleksey,

On Tue, Dec 3, 2013 at 3:15 PM, Bader, Aleksey A
<aleksey.a.bader at intel.com> wrote:
> Hi LLVM developers,
>
> I need your help with reviewing and committing the patch.
>
> Building this program:
>
> define <16 x i32> @test() {
> entry:
>   %0 = icmp slt <16 x i32> undef, undef
>   %1 = select <16 x i1> %0, <16 x i32> undef, <16 x i32> zeroinitializer
>   ret <16 x i32> %1
> }
>
> fails on KNL because of an optimization in PerformSELECTCombine, which
> replaces (x < y) ? a : 0 => (x < y) & a.
>
> This seems to be profitable only if we keep the comparison result in the same
> register group as ‘a’ (i.e. for vector architectures without dedicated mask
> registers).
>
> There is no need for this optimization on KNL because vselect can be
> implemented as a masked move (a single instruction).
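
A rough scalar model of the three forms discussed above may help when reading the patch. It is only a sketch in plain C++, not LLVM code: the function names, the 16-lane array type, and the 16-bit mask layout are all illustrative assumptions. It shows that the select against zero, the compare-and-AND rewrite, and a zero-masked move driven by a separate mask value all produce the same lanes; the difference is only in where the comparison result lives.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 16 x i32 vector, modelled as a plain array.
constexpr std::size_t kLanes = 16;
using Vec = std::array<std::int32_t, kLanes>;

// Reference semantics of the IR: (x < y) ? a : 0, lane by lane.
Vec select_or_zero(const Vec& x, const Vec& y, const Vec& a) {
    Vec r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = (x[i] < y[i]) ? a[i] : 0;
    return r;
}

// The combined form: the comparison yields an all-ones or all-zeros lane
// held in the same kind of register as 'a', and the select becomes an AND.
Vec compare_and(const Vec& x, const Vec& y, const Vec& a) {
    Vec r{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        const std::int32_t lane_mask = (x[i] < y[i]) ? -1 : 0;
        r[i] = lane_mask & a[i];
    }
    return r;
}

// The mask-register form: the comparison yields one bit per lane in a
// separate 16-bit mask, and a zero-masked move keeps or clears each lane.
Vec masked_move(const Vec& x, const Vec& y, const Vec& a) {
    std::uint16_t k = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (x[i] < y[i])
            k = static_cast<std::uint16_t>(k | (1u << i));

    Vec r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = ((k >> i) & 1u) ? a[i] : 0;
    return r;
}
```

In this model, compare_and only makes sense if the comparison result is already a full-width vector; when it is a bit mask, as in masked_move, turning it back into a vector just to AND it is extra work.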

This LGTM. I was just looking at the same issue, so thanks for this. :)

-Cameron