[PATCH] Teach the DAGCombiner how to fold a OR of two shufflevector into a single shufflevector node

Wed Mar 5 12:19:10 PST 2014

> And that having been said, maybe the specifics of what you're doing could
> be a useful canonicalization -- you'd have to provide additional details.

It's basically the function "CollectShuffleElements" in
lib/Transforms/InstCombine/InstCombineVectorOps.cpp. Its job appears to
be to hunt backwards from an insertelement/extractelement pair and
construct a shuffle by any means necessary. I think it can produce
arbitrary shuffles at the moment, provided the type doesn't change
half-way through[1].

I'm trying to extend this so that the eventual type *can* be different
from the inputs (to avoid "(scalar_to_vector (extract_vector_elt
...))" sequences in the backends, primarily). Perhaps this makes sense
because (de facto) the only cases considered are insert/extract, which
are probably trying to build a vector anyway. But Nadav's comments
gave me pause.

I realise my initial problem could be solved with a target-specific
DAGCombine, but if the consensus is that that's the best path, then the
existing code needs a serious look, because it's almost certainly too
general as well.

Cheers.

Tim.

[1]. For example (indices picked pretty much randomly and it Just
Worked), try "opt -instcombine" on this:

define <4 x i32> @foo(<4 x i32> %in1, <4 x i32> %in2) {
  %e0 = extractelement <4 x i32> %in1, i32 3
  %e1 = extractelement <4 x i32> %in1, i32 1
  %e2 = extractelement <4 x i32> %in1, i32 3
  %e3 = extractelement <4 x i32> %in2, i32 0

  %vec.0 = insertelement <4 x i32> undef, i32 %e0, i32 0
  %vec.1 = insertelement <4 x i32> %vec.0, i32 %e1, i32 1
  %vec.2 = insertelement <4 x i32> %vec.1, i32 %e2, i32 2
  %vec.3 = insertelement <4 x i32> %vec.2, i32 %e3, i32 3

  ret <4 x i32> %vec.3
}
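
To make the footnote concrete, here is a rough sketch of the mask such
a combine has to come up with for @foo. It is a toy Python model, not
the LLVM code: collect_shuffle_mask, UNDEF and the (lane, source, index)
tuples are names invented for this illustration, and it walks the
inserts in program order rather than hunting backwards the way
CollectShuffleElements does. The only LLVM fact it relies on is the
shufflevector mask convention: indices 0..N-1 pick from the first
operand, N..2N-1 from the second.

```python
# Toy model (hypothetical helper, not LLVM code). Each entry stands for
#   insertelement %vec, (extractelement %src, idx), lane
# and is written as the tuple (lane, src, idx).
UNDEF = -1  # stand-in for an undef mask element


def collect_shuffle_mask(inserts, num_lanes, sources, source_width):
    """Build a shufflevector-style mask from insert/extract pairs.

    Mask values 0..source_width-1 select from sources[0]; values
    source_width..2*source_width-1 select from sources[1].
    """
    if len(sources) > 2:
        raise ValueError("a single shuffle takes at most two inputs")
    mask = [UNDEF] * num_lanes
    for lane, src, idx in inserts:  # a later insert overwrites the lane
        if not 0 <= idx < source_width:
            raise ValueError("extract index out of range")
        mask[lane] = sources.index(src) * source_width + idx
    return mask


# The insert/extract pairs of @foo, in program order.
foo_inserts = [(0, "in1", 3), (1, "in1", 1), (2, "in1", 3), (3, "in2", 0)]
foo_mask = collect_shuffle_mask(foo_inserts, 4, ["in1", "in2"], 4)
print(foo_mask)  # [3, 1, 3, 4]
```

Worked out by hand from that mask (this is not pasted opt output), the
single-shuffle form of @foo would be:

  shufflevector <4 x i32> %in1, <4 x i32> %in2,
                <4 x i32> <i32 3, i32 1, i32 3, i32 4>

The same helper also covers the case where the result is narrower than
the input: collect_shuffle_mask([(0, "in1", 2), (1, "in1", 0)], 2,
["in1"], 4) gives [2, 0], i.e. a 2-element mask over a 4-element
source.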