[PATCH] Teach the DAGCombiner how to fold a OR of two shufflevector into a single shufflevector node
Tim Northover
t.p.northover at gmail.com
Wed Mar 5 12:19:10 PST 2014
> And that having been said, maybe the specifics of what you're doing could
> be a useful canonicalization -- you'd have to provide additional details.
It's basically the function "CollectShuffleElements" in
lib/Transforms/InstCombine/InstCombineVectorOps.cpp. Its job appears to
be to hunt backwards from an insertelement/extractelement pair and
construct a shuffle by any means necessary. I think it can produce
arbitrary shuffles at the moment, provided the type doesn't change
half-way through[1].
I'm trying to extend this so that the eventual type *can* be different
from the inputs (to avoid "(scalar_to_vector (extract_vector_elt
...))" sequences in the backends, primarily). Perhaps this makes sense
because (de facto) the only cases considered are insert/extract, which
are probably trying to build a vector anyway. But Nadav's comments
gave me pause.
I realise my initial problem could be solved with a target-specific
DAGCombine, but if the consensus is that's the best path then the
existing code needs a serious look because it's almost certainly too
general as well.
Cheers.
Tim.
[1]. For example (indices picked pretty much randomly and it Just
Worked), try "opt -instcombine" on this:
define <4 x i32> @foo(<4 x i32> %in1, <4 x i32> %in2) {
  %e0 = extractelement <4 x i32> %in1, i32 3
  %e1 = extractelement <4 x i32> %in1, i32 1
  %e2 = extractelement <4 x i32> %in1, i32 3
  %e3 = extractelement <4 x i32> %in2, i32 0
  %vec.0 = insertelement <4 x i32> undef, i32 %e0, i32 0
  %vec.1 = insertelement <4 x i32> %vec.0, i32 %e1, i32 1
  %vec.2 = insertelement <4 x i32> %vec.1, i32 %e2, i32 2
  %vec.3 = insertelement <4 x i32> %vec.2, i32 %e3, i32 3
  ret <4 x i32> %vec.3
}
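
To make the footnote concrete, here is a rough Python model of what such an
insert/extract chain amounts to. It is only a sketch of the idea, not the
InstCombine code: the helper names (collect_shuffle_mask, shufflevector,
run_insert_chain) and the tuple encoding of the chain are made up for
illustration, and it assumes both inputs have the same width as the result.
It computes the mask that a single two-input shuffle would need and checks
that applying it gives the same lanes as running the chain step by step.

```python
# Toy model of the @foo example; NOT LLVM's implementation.
# All names below are hypothetical and exist only for this illustration.

UNDEF = -1  # stands in for an undef mask entry


def collect_shuffle_mask(chain, width):
    """Build a shuffle mask from an insert/extract chain.

    chain: list of (src_vec, src_lane, dst_lane) in program order, where
    src_vec is 0 for the first input (%in1) and 1 for the second (%in2).
    """
    mask = [UNDEF] * width
    for src_vec, src_lane, dst_lane in chain:
        # A two-input shuffle numbers the second operand's lanes from
        # `width` upward, so lane 0 of %in2 becomes index 4 when width is 4.
        mask[dst_lane] = src_vec * width + src_lane
    return mask


def shufflevector(v1, v2, mask):
    """Apply a mask to the concatenation of v1 and v2 (undef -> None)."""
    concat = list(v1) + list(v2)
    return [None if m == UNDEF else concat[m] for m in mask]


def run_insert_chain(v1, v2, chain, width):
    """Execute the extractelement/insertelement pairs one at a time."""
    inputs = (v1, v2)
    result = [None] * width  # starts out as undef
    for src_vec, src_lane, dst_lane in chain:
        result[dst_lane] = inputs[src_vec][src_lane]
    return result


# The chain from @foo: %e0..%e3 inserted into lanes 0..3.
FOO_CHAIN = [(0, 3, 0), (0, 1, 1), (0, 3, 2), (1, 0, 3)]
FOO_MASK = collect_shuffle_mask(FOO_CHAIN, 4)

if __name__ == "__main__":
    in1, in2 = [10, 11, 12, 13], [20, 21, 22, 23]
    print(FOO_MASK)
    print(shufflevector(in1, in2, FOO_MASK))
    print(run_insert_chain(in1, in2, FOO_CHAIN, 4))
```

Under this model the mask for @foo comes out as <3, 1, 3, 4>, which
corresponds to a single shufflevector of %in1 and %in2. The exact text
that opt prints may differ.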