Patch to fix vselect compilation on KNL.
Nadav Rotem
nrotem at apple.com
Wed Dec 4 09:14:30 PST 2013
LGTM!
On Dec 4, 2013, at 6:06 AM, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> Hey Aleksey,
>
> On Tue, Dec 3, 2013 at 3:15 PM, Bader, Aleksey A
> <aleksey.a.bader at intel.com> wrote:
>> Hi LLVM developers,
>>
>> I need your help with reviewing and committing the patch.
>>
>> Compiling the following program:
>>
>> define <16 x i32> @test() {
>> entry:
>>   %0 = icmp slt <16 x i32> undef, undef
>>   %1 = select <16 x i1> %0, <16 x i32> undef, <16 x i32> zeroinitializer
>>   ret <16 x i32> %1
>> }
>>
>> fails on KNL because of an optimization in PerformSELECTCombine, which
>> rewrites (x < y) ? a : 0 as (x < y) & a.
>>
>> This rewrite seems to be profitable only if the comparison result is kept
>> in the same register group as ‘a’ (i.e. on vector architectures without
>> dedicated mask registers).
>>
>> The optimization is not needed on KNL, because vselect can be
>> implemented as a masked move (a single instruction).
>
> This LGTM. I was just looking at the same issue, so thanks for this. :)
>
> -Cameron
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
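
As a rough illustration of the reasoning in the quoted message, here is a
scalar C++ model of the two mask representations. It is only a sketch, not
the patch and not LLVM code: all names (Vec, cmp_lt_wide, cmp_lt_bits,
and_with_wide_mask, masked_move_zero, select_ref) are hypothetical. It
assumes that a "wide" compare produces an all-ones or all-zeros value per
lane in an ordinary vector, while a mask-register compare produces one bit
per lane, as <16 x i1> does in the test case.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kLanes = 16;
using Vec = std::array<int32_t, kLanes>;

// Reference semantics of: select (x < y), a, zeroinitializer
Vec select_ref(const Vec &x, const Vec &y, const Vec &a) {
  Vec r{};
  for (int i = 0; i < kLanes; ++i) r[i] = (x[i] < y[i]) ? a[i] : 0;
  return r;
}

// Assumed "wide" compare: each lane is all-ones (-1) or all-zeros,
// stored in the same kind of vector as the data.
Vec cmp_lt_wide(const Vec &x, const Vec &y) {
  Vec m{};
  for (int i = 0; i < kLanes; ++i) m[i] = (x[i] < y[i]) ? -1 : 0;
  return m;
}

// The rewrite (x < y) ? a : 0  =>  (x < y) & a.
// Valid here only because the mask lanes are all-ones or all-zeros.
Vec and_with_wide_mask(const Vec &m, const Vec &a) {
  Vec r{};
  for (int i = 0; i < kLanes; ++i) r[i] = m[i] & a[i];
  return r;
}

// Assumed mask-register compare: one bit per lane, 16 lanes in a uint16_t.
uint16_t cmp_lt_bits(const Vec &x, const Vec &y) {
  uint16_t k = 0;
  for (int i = 0; i < kLanes; ++i)
    if (x[i] < y[i]) k = static_cast<uint16_t>(k | (1u << i));
  return k;
}

// Zeroing masked move: lane i keeps a[i] when bit i of k is set, else 0.
// With a bit mask this is the select itself; no AND with the data is needed.
Vec masked_move_zero(uint16_t k, const Vec &a) {
  Vec r{};
  for (int i = 0; i < kLanes; ++i) r[i] = ((k >> i) & 1u) ? a[i] : 0;
  return r;
}

// Small self-check that both strategies agree with the reference select.
void check_equivalence(const Vec &x, const Vec &y, const Vec &a) {
  assert(and_with_wide_mask(cmp_lt_wide(x, y), a) == select_ref(x, y, a));
  assert(masked_move_zero(cmp_lt_bits(x, y), a) == select_ref(x, y, a));
}
```

In this model the AND form works only because the wide mask has the same
width and layout as the data. A one-bit-per-lane mask cannot be ANDed with
'a' directly; it would first have to be expanded to a wide mask, whereas the
masked move consumes it as-is.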