[cfe-dev] Was the SSE2 _mm_slli_si128 / _mm_srli_si128 behavior change in r272246 intentional?

Fri Jun 24 23:38:49 PDT 2016

That was not intentional at all. I foolishly thought that sending
everything about 15 to the true leg of this ternary operator would prevent
needing a mask on each of the immediate uses in the shufflevector. I'll fix
it right away. AVX2 and AVX512 are similarly broken.

  ((char)(imm)&0xF0) ? _mm_setzero_si128() :
    \
                     (__m128i)__builtin_shufflevector(
    \

 (__v16qi)_mm_setzero_si128(), \
                                                 (__v16qi)(__m128i)(a),
   \
                                                 16 - (char)(imm),
    \
                                                 17 - (char)(imm),
    \
                                                 18 - (char)(imm),
    \
                                                 19 - (char)(imm),
    \
                                                 20 - (char)(imm),
    \
                                                 21 - (char)(imm),
    \
                                                 22 - (char)(imm),
    \
                                                 23 - (char)(imm),
    \
                                                 24 - (char)(imm),
    \
                                                 25 - (char)(imm),
    \
                                                 26 - (char)(imm),
    \
                                                 27 - (char)(imm),
    \
                                                 28 - (char)(imm),
    \
                                                 29 - (char)(imm),
    \
                                                 30 - (char)(imm),
    \
                                                 31 - (char)(imm)); })

On Fri, Jun 24, 2016 at 6:10 PM, Brooks Moses <bmoses at google.com> wrote:

> Craig et al -
>
> We recently ran into a code bug that was revealed by the way that r272246
> ("[X86] Handle AVX2 pslldqi and psrldqi intrinsics shufflevector creation
> directly in the header file instead of in CGBuiltin.cpp. Simplify the sse2
> equivalents as well.") changed the behavior of the SSE2 _mm_slli_si128 and
> _mm_srli_si128 intrinsics.
>
> Previously, these intrinsics worked correctly with all values of shift
> operands, and would produce a result of zero for any shift value outside of
> the [0-15] range.
>
> With this change, any value of the shift operand outside of the range
> [0-16] -- note, that's 16, not 15 -- will cause a compilation error.
>
> Now, I can see good arguments for why limiting the range to [0-16] would
> be a good idea; after all, it did reveal a code bug in our case.  However,
> this behavior change wasn't mentioned in the change description, and I can
> also see good arguments for why one wouldn't want this limit -- so I figure
> it's a good idea to raise the question explicitly: Was this behavior change
> intentional?  If not, is it nonetheless the desired behavior?
>
> - Brooks
>
>

-- 
~Craig
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20160624/5269ef15/attachment.html>