[PATCH] D10555: [X86] Replace avx2.pbroadcast intrinsics with native IR.
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Sat Aug 15 11:15:19 PDT 2015
RKSimon added a comment.
So I revisited this, as I've been doing a lot of work recently on instcombiner reduction of intrinsics.
Looking at the O0/O1/O2 codegen, the pbroadcast intrinsics (and the broadcastss/broadcastsd register variants) are well behaved and keep to the expected instructions; this is no different from how many of the other shuffle intrinsics are already implemented in the headers. The only one that has problems is _mm256_broadcastsi128_si256 (vbroadcasti128), which isn't being proposed here.
Along with an update of avx2intrin.h to call __builtin_shufflevector directly (and suitable tests to ensure that debug code doesn't change in the future), I'd say that this should be a win. If people are still hesitant, we should at least push forward with support in instcombiner now instead of putting it off.
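For illustration, a header-style broadcast written as a plain shuffle might look roughly like the sketch below. This is only a sketch under assumptions, not the actual avx2intrin.h change: the names v4i32, broadcast_lane0_epi32 and self_check are hypothetical, it uses generic vector extensions rather than __m128i, and it falls back to a lane-by-lane initializer on compilers that don't provide __builtin_shufflevector.

```c
#include <assert.h>
#include <stdint.h>

/* Older compilers may lack __has_builtin; treat every builtin as absent. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical 128-bit vector of four 32-bit lanes (stand-in for __v4si). */
typedef int32_t v4i32 __attribute__((vector_size(16)));

/* Hypothetical helper: splat lane 0 across all four lanes, expressed as a
 * generic shuffle instead of a target-specific pbroadcast builtin. */
static inline v4i32 broadcast_lane0_epi32(v4i32 a)
{
#if __has_builtin(__builtin_shufflevector)
    /* Every output lane selects index 0 of the first operand. */
    return __builtin_shufflevector(a, a, 0, 0, 0, 0);
#else
    /* Portable fallback with the same result, built lane by lane. */
    v4i32 r = { a[0], a[0], a[0], a[0] };
    return r;
#endif
}

/* Small runtime check that every lane equals the original lane 0. */
static void self_check(void)
{
    v4i32 in = { 7, 1, 2, 3 };
    v4i32 out = broadcast_lane0_epi32(in);
    for (int i = 0; i < 4; ++i)
        assert(out[i] == 7);
}
```

Whether a shuffle like this still selects a single vpbroadcastd at O0 is exactly what the proposed tests would need to pin down; the runtime check here only covers the result values, not the generated instructions.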
http://reviews.llvm.org/D10555