[PATCH] [X86] Replace avx2.pbroadcast intrinsics with native IR.

Sanjay Patel spatel at rotateright.com
Fri Jun 19 14:24:30 PDT 2015

In http://reviews.llvm.org/D10555#191124, @ab wrote:

> To make sure I understand: this is only a problem because of DAGCombines running at -O0, right?  (and perhaps some of the lowering being too smart? though without combines I'd find that surprising)
>  And this in turn is only a problem because the C intrinsics (_mm_*) are always inlined, and thus can be combined, right?

I think the problem is independent of inlining and DAGCombines. As an example, consider this:

  __m128 foo(__m256 a) {
    return _mm256_extractf128_ps(a, 0);
  }

After http://reviews.llvm.org/D8275, this becomes a shufflevector in clang, and there's not much hope of turning it back into a vextractf128. It becomes an ISD::EXTRACT_SUBVECTOR in the DAG without any combiner opts AFAICT. Then, it turns into an EXTRACT_SUBREG machine inst. Then, it's either just a move or nothing at all in the final x86 code.
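
To make the shufflevector point concrete, here is a rough scalar model in plain C of what the index-0 extract amounts to. This is only a sketch: shuffle_f32 and extract_low_half are hypothetical names invented for illustration, the arrays stand in for the __m256 and __m128 values, and the <0, 1, 2, 3> mask is my assumption about what the shuffle looks like for index 0. It is not what clang or the backend actually emits.

```c
#include <stddef.h>

/* Hypothetical helper modelling a single-source shuffle:
 * dst[i] = src[mask[i]] for each of the n result lanes. */
static void shuffle_f32(const float *src, const size_t *mask, size_t n,
                        float *dst) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[mask[i]];
}

/* Scalar stand-in for _mm256_extractf128_ps(a, 0): select lanes 0..3
 * of the 8-lane input. The mask is an assumption made for illustration. */
static void extract_low_half(const float src[8], float dst[4]) {
    static const size_t mask[4] = {0, 1, 2, 3};
    shuffle_f32(src, mask, 4, dst);
}
```

Because the mask picks a contiguous, aligned half of the source, nothing has to be computed: the result is just the low half of the input, which is consistent with the extract ending up as a subregister read.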


