[PATCH] D79078: [VectorCombine] Leave reduction operation to SLP
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat May 16 09:29:38 PDT 2020
lebedev.ri added a comment.
In D79078#2040144 <https://reviews.llvm.org/D79078#2040144>, @spatel wrote:
> In D79078#2038950 <https://reviews.llvm.org/D79078#2038950>, @lebedev.ri wrote:
>
> > In D79078#2010024 <https://reviews.llvm.org/D79078#2010024>, @spatel wrote:
> >
> > > I also sympathize with trying to solve this here rather than SLP. One of the reasons vector-combine exists is because SLP became too hard to reason about. In hindsight, we should have created a separate pass for reductions - those are not traditional SLP concerns. Just my opinion. :)
> >
> >
> > I'm not sure what you have in mind here?
> > That *this* pass should also form such reductions?
> > Or that we should not disturb them after SLP formed them?
> > Or something else?
>
>
> The reduction logic is a complicated blob of code, so I don't think it belongs here. I'd split it off from SLP into its own pass, but it looks like a lot of untangling.
> Currently, we're running this pass *before* SLP only. We could move this after SLP to make sure we are not disturbing reductions before SLP has a chance to recognize them...but I'm not sure if that would also now cause regressions. I don't have a good feel for how these passes are interacting.
>
> What does it take to cause the infinite looping that you found?
No, i mean in the case if we would be forming reductions here,
because i think we'd then have two conflicting transforms here,
and they would cause traditional instcombine/dagcombine-esque infinite combine loop.
> Looking at that 1st test - if we allow iteration in this pass, we'll end up with:
>
> define i32 @ext_ext_reduction(<4 x i32> %x, <4 x i32> %y) {
> %and = and <4 x i32> %x, %y
> %1 = shufflevector <4 x i32> %and, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
> %2 = or <4 x i32> %1, %and
> %3 = shufflevector <4 x i32> %and, <4 x i32> undef, <4 x i32> <i32 2, i32 undef, i32 undef, i32 undef>
> %4 = or <4 x i32> %3, %2
> %5 = shufflevector <4 x i32> %and, <4 x i32> undef, <4 x i32> <i32 3, i32 undef, i32 undef, i32 undef>
> %6 = or <4 x i32> %5, %4
> %7 = extractelement <4 x i32> %6, i64 0
> ret i32 %7
> }
>
>
>
> And nothing knows how to form the optimal reduction from that pattern. We could say that's the real problem - source code could be in that form originally, so we just miss the reassociation optimization opportunity.
That would be the outcome i would prefer, yes.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D79078/new/
https://reviews.llvm.org/D79078
More information about the llvm-commits
mailing list