[PATCH][AVX512] Fix miscompile for unpack

Demikhovsky, Elena elena.demikhovsky at intel.com
Wed Sep 10 01:57:34 PDT 2014


Hi Adam,

You are right, thank you for fixing this.
You can commit this change.


- Elena

From: Adam Nemet [mailto:anemet at apple.com]
Sent: Wednesday, September 10, 2014 11:06
To: Demikhovsky, Elena
Cc: llvm-commits
Subject: Re: [PATCH][AVX512] Fix miscompile for unpack

Ping^2?

On Sep 2, 2014, at 10:55 AM, Adam Nemet <anemet at apple.com> wrote:

> Ping?
>
> On Aug 13, 2014, at 11:48 PM, Adam Nemet <anemet at apple.com> wrote:
>
>> Hi Elena,
>>
>> r189189 implemented AVX512 unpack by essentially performing a 256-bit unpack
>> between the low and the high 256 bits of src1 into the low part of the
>> destination and another unpack of the low and high 256 bits of src2 into the
>> high part of the destination.
>>
>> I don't think that's how unpack works.  AVX512 unpack simply has more 128-bit
>> lanes, but other than that it works the same way as AVX.  So in each 128-bit
>> lane, we're always interleaving certain parts of both operands rather than
>> different parts of one of the operands.
>>
>> E.g. for this:
>> __v16sf a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
>> __v16sf b = { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
>> __v16sf c = __builtin_shufflevector(a, b, 0, 8, 1, 9, 4, 12, 5, 13, 16,
>>                                            24, 17, 25, 20, 28, 21, 29);
>>
>> we generated vunpcklps (notice how the elements of a and b are not interleaved
>> in the shuffle mask).  In turn, c was set to this:
>>
>> 0 16 1 17 4 20 5 21 8 24 9 25 12 28 13 29
>>
>> Obviously this should have just returned the mask of the shufflevector,
>> since a and b together hold the element indices 0 through 31.
>>
>> I mostly reverted this change and made sure the original AVX code worked
>> for 512-bit vectors as well.
>>
>> I also updated the tests, since they matched the old logic in the code.
>>
>> Please let me know if this looks good.
>>
>> Thanks,
>> Adam
>>
>
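
To make the per-lane behaviour described in the quoted message concrete, here
is a minimal C sketch.  It models unpack-low on 32-bit elements one 128-bit
lane at a time and compares it with a generic two-source shuffle.  The names
shuffle, unpack_lo, same_vector, thread_mask and unpack_lo_mask are made up
for this illustration and are not LLVM or intrinsic APIs; the model assumes
the usual unpcklps interleaving within each lane.

```c
#include <assert.h>

#define N 16 /* 32-bit elements in a 512-bit vector */

/* Shuffle mask from the example in the thread (hypothetical name). */
const int thread_mask[N] = { 0, 8, 1, 9, 4, 12, 5, 13,
                             16, 24, 17, 25, 20, 28, 21, 29 };

/* Mask that a per-lane unpack-low corresponds to (hypothetical name). */
const int unpack_lo_mask[N] = { 0, 16, 1, 17, 4, 20, 5, 21,
                                8, 24, 9, 25, 12, 28, 13, 29 };

/* Generic two-source shuffle: index i < N selects a[i], otherwise b[i - N]. */
void shuffle(const float *a, const float *b, const int *mask, float *out)
{
    for (int i = 0; i < N; i++) {
        assert(mask[i] >= 0 && mask[i] < 2 * N);
        out[i] = mask[i] < N ? a[mask[i]] : b[mask[i] - N];
    }
}

/* Assumed model of unpack-low: in every 128-bit lane (4 floats),
   interleave the two low elements of a with the two low elements of b. */
void unpack_lo(const float *a, const float *b, float *out)
{
    for (int lane = 0; lane < N; lane += 4) {
        out[lane + 0] = a[lane + 0];
        out[lane + 1] = b[lane + 0];
        out[lane + 2] = a[lane + 1];
        out[lane + 3] = b[lane + 1];
    }
}

/* Returns 1 if both vectors hold the same N elements, 0 otherwise. */
int same_vector(const float *x, const float *y)
{
    for (int i = 0; i < N; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

With a = { 0, ..., 15 } and b = { 16, ..., 31 }, unpack_lo(a, b, out) should
agree with shuffle(a, b, unpack_lo_mask, out) and disagree with
shuffle(a, b, thread_mask, out), which is why the mask from the thread must
not be matched as an unpack.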