[PATCH] D68930: [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595)

Sat Oct 19 11:04:30 PDT 2019

lebedev.ri added a comment.

In D68930#1715645 <https://reviews.llvm.org/D68930#1715645>, @spatel wrote:

> In D68930#1715209 <https://reviews.llvm.org/D68930#1715209>, @lebedev.ri wrote:
>
> > Ignoring the -O3 issue, i'm not really confident AggressiceInstCombine is a great fit for these problems.
> >  The thing is, these folds are not one-off "folds - yay, no fold - shrug".
> >  There isn't much interest in this fold happening in itself.
> >  The really important part is that it will pave the road for other folds to happen.
> >  In particular, this fold will allow e.g. rL373964 <https://reviews.llvm.org/rL373964> to happen.
> >  The situation is likely similar with all other "aggressive" folds i'we added.
>
>
> Yes, I understand this patch alone doesn't look like a big change.

To recap, the main problem is not that they use InstSimplify,
but that `SimplifyAssociativeBinOp()` uses recursive reasoning, correct?

In this case before we ever call `SimplifyAddInst()` we need to match 3 instructions - `!= 0` + 2 right shifts.
`reassociateShiftAmtsOfTwoSameDirectionShifts()` for example only requires just 2 instructions,
`foldShiftIntoShiftInAnotherHandOfAndInICmp()` - 4 instructions,
`dropRedundantMaskingOfLeftShiftInput()` - 3/4 instructions.

So *this* case is certainly not worst,
i`m honestly not sure what's re-triggered this disscussion yet again...

> (We'd move the whole reassociateShiftAmtsOfTwoSameDirectionShifts() if we decide to move anything?) 
>  But it should be ok. We can assume that regular and aggressive will work together,
>  and we can make sure that doesn't break by adding a test within PhaseOrdering.
>  Obviously, it's more work to do it that way, so that's probably why we haven't done it often.

To be perfectly honest i don't really see/don't look forward that happening,
because i don't really believe just a single pass of AIC followed by IC will be enough to still handle everything.
i suspect it would need to be IC-AIC-IC-AIC-IC, and right now AIC only runs once.

> An alternative could be to guard this code (and possibly many more) with:
> 
>   if (ExpensiveCombines)
> 
> ?

Hm, `opt -instcombine` does result in `ExpensiveCombines=1`, `opt/clang -O0/-O1/-O2` doesn't, `opt/clang -O3` does.
So InstCombine's `ExpensiveCombines` is basically in-pass AIC (i.e. they run at same optlevels),
just without all the issues that will arise from not running combines till completion.
I suppose this could work, but i *really* don't like -O3 limitation (should be -O2 in these cases,
but not for the sole existing transform guarded by `ExpensiveCombines`).

All that being said, if it's really this much of a concern to warrant immediate action,
can't we simply expose `"Depth"` in the `llvm::SimplifyAddInst()`
and call it with such a depth that either disallows recursive reasoning,
or severely limits it? Some fastpaths will likely needed to be added then most likely.
That may be the best outcome considering the options..

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68930/new/

https://reviews.llvm.org/D68930