r334208 - [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics.
Martin Storsjö via cfe-commits
cfe-commits at lists.llvm.org
Fri Jun 8 00:59:02 PDT 2018
On Thu, 7 Jun 2018, Craig Topper via cfe-commits wrote:
> Author: ctopper
> Date: Thu Jun 7 10:28:03 2018
> New Revision: 334208
>
> URL: http://llvm.org/viewvc/llvm-project?rev=334208&view=rev
> Log:
> [X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics.
>
> We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits.
FWIW, this change broke building libaom:
https://bugs.chromium.org/p/aomedia/issues/detail?id=1945
In libaom, there's a macro construct like this:
#define v256_shr_n_byte(a, n) \
  ((n) < 16 \
       ? _mm256_alignr_epi8( \
             _mm256_permute2x128_si256(a, a, _MM_SHUFFLE(2, 0, 0, 1)), a, n) \
       : ((n) > 16 \
              ? _mm256_srli_si256( \
                    _mm256_permute2x128_si256(a, a, _MM_SHUFFLE(2, 0, 0, 1)), \
                    (n)-16) \
              : _mm256_permute2x128_si256(a, a, _MM_SHUFFLE(2, 0, 0, 1))))
Since this commit, compilation fails because the _mm256_srli_si256 call
gets an out-of-range immediate, even though the top-level ternary operator
never actually selects that branch. I'm not sure there's anything to do
from the clang point of view here; I guess it's a tradeoff between having
stricter parameter checks for the intrinsics and the convenience of piling
them up in a macro like this in libaom.
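To illustrate the mechanism, here is a portable C sketch that uses no real
intrinsics. Everything in it is hypothetical: vec16, shr_bytes, IMM8_CHECK
and the shr_over16* macros are stand-ins, and the sizeof(char[...]) trick
only imitates a compile-time range check on an immediate; it is not how
clang implements the builtin check. It shows that a constant-checked operand
is rejected even in a ternary branch that is never taken, and one possible
way to keep such a macro compiling (clamping the immediate in the dead
branch). It is not claimed to be the fix libaom adopted.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte vector standing in for a SIMD register. */
typedef struct { uint8_t b[16]; } vec16;

/* Shift bytes toward index 0 by n positions, filling with zeros. */
static vec16 shr_bytes_impl(vec16 a, int n) {
    vec16 r;
    memset(r.b, 0, sizeof r.b);
    for (int i = 0; i + n < 16; ++i) r.b[i] = a.b[i + n];
    return r;
}

/* Imitation of an "immediate must fit in 8 bits" check: a constant
   outside 0..255 produces a negative array size, which is a compile
   error wherever the macro is expanded, reachable or not. */
#define IMM8_CHECK(imm) \
    ((void)sizeof(char[((imm) >= 0 && (imm) <= 255) ? 1 : -1]))

#define shr_bytes(a, imm) (IMM8_CHECK(imm), shr_bytes_impl((a), (imm)))

/* Same shape as the macro above: expanding this with n = 4 would not
   compile, because (n) - 16 is -12 in the branch that is never taken. */
#define shr_over16(a, n) \
    ((n) > 16 ? shr_bytes((a), (n) - 16) : (a))

/* Assumed workaround: clamp the immediate so it is in range in every
   branch, including the ones the ternary will not select. */
#define shr_over16_clamped(a, n) \
    ((n) > 16 ? shr_bytes((a), (n) > 16 ? (n) - 16 : 0) : (a))
```

With this, shr_over16_clamped(v, 4) compiles and returns v unchanged,
whereas shr_over16(v, 4) would be rejected at compile time.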
// Martin