[PATCH] [X86] Replace avx2.pbroadcast intrinsics with native IR.
Sanjay Patel
spatel at rotateright.com
Fri Jun 19 14:24:30 PDT 2015
In http://reviews.llvm.org/D10555#191124, @ab wrote:
> To make sure I understand: this is only a problem because of DAGCombines running at -O0, right? (and perhaps some of the lowering being too smart? though without combines I'd find that surprising)
> And this in turn is only a problem because the C intrinsics (_mm_*) are always inlined, and thus can be combined, right?
I think the problem is independent of inlining and DAGCombines. As an example, consider this:
#include <immintrin.h>

__m128 foo(__m256 a) {
  return _mm256_extractf128_ps(a, 0);
}
After http://reviews.llvm.org/D8275, clang emits a shufflevector for this, and there's not much hope of turning it back into a vextractf128. As far as I can tell, it becomes an ISD::EXTRACT_SUBVECTOR in the DAG without any combiner optimizations being involved. That then turns into an EXTRACT_SUBREG machine instruction, which ends up as either just a move or nothing at all in the final x86 code.
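To illustrate why nothing is left to turn back into a vextractf128, here is a rough scalar model of the shuffle semantics. It is only a sketch: vec8f, vec4f, shuffle_8_to_4 and extract_128 are hypothetical names I made up as stand-ins for __m256, __m128, shufflevector and the intrinsic, and the masks <0,1,2,3> / <4,5,6,7> are my assumption about what the header produces for index 0 / 1.

```c
/* Hypothetical stand-ins for __m256 and __m128: plain float arrays,
   so this sketch builds without AVX support. */
typedef struct { float lane[8]; } vec8f;
typedef struct { float lane[4]; } vec4f;

/* Models a one-source shufflevector from 8 to 4 elements:
   result element i is source element mask[i]. */
static vec4f shuffle_8_to_4(vec8f a, const int mask[4]) {
    vec4f r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = a.lane[mask[i]];
    return r;
}

/* Models _mm256_extractf128_ps(a, imm): only the low bit of imm
   matters, selecting the low (0) or high (1) 128-bit half. */
static vec4f extract_128(vec8f a, int imm) {
    int base = (imm & 1) * 4;
    int mask[4] = { base, base + 1, base + 2, base + 3 };
    return shuffle_8_to_4(a, mask);
}
```

With index 0 the mask is the identity on the first four elements, so the result is just the low half of the source. That matches the EXTRACT_SUBVECTOR / EXTRACT_SUBREG path above: the low half of a ymm register is already the corresponding xmm register.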
http://reviews.llvm.org/D10555