[PATCH] D46742: [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins.

Fri May 11 03:51:15 PDT 2018

GBuella added a comment.

In https://reviews.llvm.org/D46742#1095658, @tkrupa wrote:

> There are four other similar intrinsics which convert to 128/256-bit vectors:
>
> __m128i _mm256_cvtepi32_epi8 (__m256i a)
> __m128i _mm256_cvtepi64_epi16 (__m256i a)
> __m128i _mm256_cvtepi64_epi8 (__m256i a)
> __m128i _mm512_cvtepi64_epi8 (__m512i a)
>
> Can you also include them?

These should probably be possible too, but for the _mm256_cvtepi32_epi8 case, for example, this is as far as I can get:

  vpmovdw %ymm0, %xmm0
  vpshufb .LCPI2_0(%rip), %xmm0, %xmm0 # xmm0 = xmm0[0,2,4,6,8,10,12,14],zero,zero,zero,zero,zero,zero,zero,zero
  vzeroupper
  retq

The expected result is a single `vpmovdb` instruction, without the extra shuffle.
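
For reference, a minimal sketch of the semantics in question, written with the GCC/Clang vector extension. It is not the header code from this patch: the type names `v8si`/`v8qi` and the helper `cvtepi32_epi8_model` are made up for illustration, and it works on plain arrays so it builds without AVX512 enabled. It assumes a compiler that provides `__builtin_convertvector` (Clang, or a recent GCC).

```c
#include <stdint.h>
#include <string.h>

/* Generic vector types; the names are illustrative, not from the patch. */
typedef int32_t v8si __attribute__((vector_size(32)));
typedef int8_t  v8qi __attribute__((vector_size(8)));

/* Hypothetical model of _mm256_cvtepi32_epi8 on plain arrays:
   truncate eight 32-bit lanes to 8 bits each, then zero the upper
   8 bytes of the 16-byte result. */
void cvtepi32_epi8_model(const int32_t in[8], int8_t out[16])
{
    v8si wide;
    memcpy(&wide, in, sizeof wide);

    /* Element-wise truncating conversion: 8 x i32 -> 8 x i8. */
    v8qi narrow = __builtin_convertvector(wide, v8qi);

    /* The result only fills 64 bits; the rest of the 128-bit
       destination is zero, matching what vpmovdb produces. */
    memset(out, 0, 16);
    memcpy(out, &narrow, sizeof narrow);
}
```

The conversion itself yields only 64 bits, so the value has to be padded with zeros to fill an `__m128i`. A lowering that emits a single `vpmovdb` would need to recognise the truncate and the zero-fill together.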

Repository:
  rC Clang

https://reviews.llvm.org/D46742