[PATCH] D38316: [InstCombine] replace bitcast to scalar + insertelement with widening shuffle + vector bitcast

Wed Sep 27 10:58:11 PDT 2017

efriedma added subscribers: arsenm, tra.
efriedma added a comment.

ARM/AArch64 are very similar in this respect, since there are multiple vector register sizes.  You'll see a similar result for your examples on aarch64.  (On 32-bit ARM, we manage to optimize away the extra copy after isel.)  I'm not quite sure how much of this logic it makes sense to put into instcombine, given most of the benefit here has to do with the way CPUs split integer and vector registers, but this is probably okay for other targets.

Does it make sense to do this transform even if the vector operand of the insertelement isn't undef?  I guess you'd need a second shuffle in that case?
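
To make the "second shuffle" idea concrete, here is a rough Python model, not the patch's code. It assumes a <2 x i8> source, a <4 x i16> destination, and insertion at lane 0. The helper names and the use of None for undef are made up for illustration, and bitcast is modeled as a reinterpretation of the in-memory bytes.

```python
UNDEF = None  # stand-in for an undef lane (illustrative only)

def bitcast_i8s_to_i16s(lanes, endian):
    """Reinterpret pairs of i8 lanes as i16 lanes; an undef byte makes the lane undef."""
    out = []
    for i in range(0, len(lanes), 2):
        pair = lanes[i:i + 2]
        out.append(UNDEF if UNDEF in pair else int.from_bytes(bytes(pair), endian))
    return out

def insert_scalar(base, x, endian):
    """Original form: bitcast <2 x i8> x to i16, then insertelement into lane 0 of base."""
    result = list(base)
    result[0] = int.from_bytes(bytes(x), endian)
    return result

def widen_then_blend(base, x, endian):
    """Hypothetical form for a non-undef base: two shuffles around a vector bitcast."""
    widened = list(x) + [UNDEF] * (2 * len(base) - len(x))  # first (widening) shuffle
    as_i16 = bitcast_i8s_to_i16s(widened, endian)           # vector bitcast
    return [as_i16[0]] + list(base[1:])                     # second shuffle: blend with base
```

In this model both forms agree for either byte order, but the non-undef case costs one extra shuffle, which is the trade-off the question is about.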

Took me a minute to reason it out, but I think this is right on big-endian targets: in both the original and the transformed sequence, the bitcast reinterprets the same bytes in the same order, so the two are semantically equivalent.
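
One way to check the big-endian reasoning is to compare memory images. The sketch below is a simplified Python model under stated assumptions: bitcast behaves like a store followed by a load of the new type, vector lane 0 sits at the lowest address on both byte orders, and the insert goes into lane 0 of an undef <4 x i16>. The function names and the None-for-undef convention are illustrative, not LLVM APIs.

```python
UNDEF = None  # stand-in for an undef byte or lane (illustrative only)

def store_i16_lanes(lanes, endian):
    """Memory image of an i16 vector, with lane 0 at the lowest address."""
    image = []
    for lane in lanes:
        if lane is UNDEF:
            image += [UNDEF, UNDEF]
        else:
            image += list(lane.to_bytes(2, endian))
    return image

def original_image(x, num_lanes, endian):
    """bitcast <2 x i8> x to i16, then insertelement into lane 0 of an undef vector."""
    scalar = int.from_bytes(bytes(x), endian)
    lanes = [scalar] + [UNDEF] * (num_lanes - 1)
    return store_i16_lanes(lanes, endian)

def transformed_image(x, num_lanes):
    """Widening shuffle to i8 lanes; the vector bitcast leaves the bytes untouched."""
    return list(x) + [UNDEF] * (2 * num_lanes - len(x))

for endian in ("little", "big"):
    assert original_image([0xAB, 0xCD], 4, endian) == transformed_image([0xAB, 0xCD], 4)
```

The scalar value differs between byte orders (0xCDAB versus 0xABCD), but storing it back with the same byte order undoes the difference, so the memory image is the same either way.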

https://reviews.llvm.org/D38316