[LLVMdev] Bug #16941
Dmitry Babokin
babokin at gmail.com
Fri Oct 25 14:16:39 PDT 2013
Nadav,
The problem appears only for vectors longer than the available hardware
register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8
on AVX). Select does a weird thing: the <8 x i1> mask comes in as two XMM
registers, select converts them to a single XMM register (i.e. 8 x 16 bit),
then immediately converts it back to two XMM registers and does the blend.
Converting back and forth has a huge overhead.
I'm attaching 3 files with vectors of length 4, 8 and 16. Try 4 on SSE4 and
you'll see that both cases work well; 8 demonstrates the difference on
SSE4. The same holds on AVX (8 vs 16).
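
For readers without the attachments, the shape of the pattern can be sketched
as follows. This is a hypothetical reproducer, not the contents of the
attached v8.ll; the function name @select8 and the use of icmp to produce the
mask are assumptions made for illustration. On SSE4 an <8 x i32> value needs
two XMM registers, so the <8 x i1> mask produced by the compare also starts
out as two registers.

```llvm
; Hypothetical sketch (not the attached v8.ll): a vector select whose
; operands are twice as wide as an XMM register when built for SSE4.
define <8 x i32> @select8(<8 x i32> %a, <8 x i32> %b,
                          <8 x i32> %x, <8 x i32> %y) {
  ; The compare yields an <8 x i1> mask; with 32-bit lanes it is
  ; computed in two XMM registers on SSE4.
  %mask = icmp slt <8 x i32> %a, %b
  ; The select is where the mask may be narrowed to 8 x 16 bit and
  ; then widened again before the blend.
  %r = select <8 x i1> %mask, <8 x i32> %x, <8 x i32> %y
  ret <8 x i32> %r
}
```

Something like `llc -mattr=+sse4.1 select8.ll -o -` (or `-mattr=+avx` with the
vector length doubled to 16) should show the generated assembly. Whether the
redundant conversion actually appears depends on the LLVM revision in use.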
On Wed, Oct 23, 2013 at 1:41 AM, Nadav Rotem <nrotem at apple.com> wrote:
>
> On Oct 21, 2013, at 12:09 PM, Dmitry Babokin <babokin at gmail.com> wrote:
>
> By the way, I'm curious, is there any reason why you focus on SSE4, not AVX?
> Seems that the vectorizer should care the most about the latest silicon.
>
>
> I am interested in looking at the SSE4 code because lowering of AVX code
> is more complicated, especially for masks. The problem is that <8 x i1> can
> be legalized to <8 x i32> for YMM, or <8 x i16> for XMM. ISPC worked
> around this limitation by explicitly extending the mask. The SEXT
> canonicalization reverted the code pattern that ISPC generated.
>
> Thanks,
> Nadav
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v4.ll
Type: application/octet-stream
Size: 464 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131026/59618dad/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v8.ll
Type: application/octet-stream
Size: 464 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131026/59618dad/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: v16.ll
Type: application/octet-stream
Size: 482 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131026/59618dad/attachment-0002.obj>