[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Simon Pilgrim
llvm-dev at redking.me.uk
Sun Sep 21 13:15:49 PDT 2014
On 20 Sep 2014, at 19:44, Chandler Carruth <chandlerc at google.com> wrote:
> If AVX is available I would expect the vpermilps/vpermilpd instruction to be used for all float/double single vector shuffles, especially as it can deal with the folded load case as well - this would avoid the integer/float execution domain transfer issue with using vpshufd.
>
> Yes, this is the obvious solution to folding memory loads. It just isn't implemented yet.
>
> Well, actually it is, but I haven't finished writing tests for it. =]
Thanks Chandler - vpermilps/vpermilpd generation looks great now.
I've found another regression - byte shifts on pre-SSSE3 targets are failing to make use of the vpslldq/vpsrldq instructions - I've attached some basic test cases.
Could vpslldq/vpsrldq also be used on SSSE3+ targets for the cases where zeros are being shifted in? Doing so avoids the need for a zero register, although these instructions aren't as good for memory folding.
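To illustrate the pattern in question, here is a minimal sketch (not the contents of the attached byte_shift.ll) that models a 16-byte vector as a Python list. It checks that a two-input shuffle of a vector with an all-zero vector, using a "shifted" mask, gives the same bytes as a whole-register byte shift. The helper names (shufflevector, pslldq, psrldq, shift_left_mask, shift_right_mask) are hypothetical and only mirror the documented behaviour of the instructions; element 0 is assumed to be the lowest byte.

```python
# Illustrative model only: a 128-bit register is a list of 16 byte values,
# with index 0 holding the least significant byte.
ZERO = [0] * 16


def shufflevector(a, b, mask):
    """Model a two-input shuffle: indices 0..15 pick from a, 16..31 from b."""
    both = a + b
    return [both[i] for i in mask]


def pslldq(v, n):
    """Model a whole-register left byte shift: zeros enter at the low end."""
    n = min(n, 16)
    return [0] * n + v[:16 - n]


def psrldq(v, n):
    """Model a whole-register right byte shift: zeros enter at the high end."""
    n = min(n, 16)
    return v[n:] + [0] * n


def shift_left_mask(n):
    """Shuffle mask equivalent to pslldq by n: low n lanes read a zero byte."""
    return [16 if i < n else i - n for i in range(16)]


def shift_right_mask(n):
    """Shuffle mask equivalent to psrldq by n: high n lanes read a zero byte."""
    return [i + n if i + n < 16 else 16 for i in range(16)]


if __name__ == "__main__":
    v = list(range(1, 17))
    for n in range(17):
        assert shufflevector(v, ZERO, shift_left_mask(n)) == pslldq(v, n)
        assert shufflevector(v, ZERO, shift_right_mask(n)) == psrldq(v, n)
    print("shuffle-with-zero masks match byte shifts for all shift amounts")
```

In this model, any shuffle whose mask has the form produced by shift_left_mask or shift_right_mask needs no separate zero operand when it is expressed as a byte shift, which is the saving asked about above.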
Cheers, Simon.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: byte_shift.ll
Type: application/octet-stream
Size: 4589 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140921/309c3196/attachment.obj>