[PATCH] D33338: [InstCombineCasts] Take in account final size when transforming sext->lshr->trunc patterns

Mon May 22 08:38:53 PDT 2017

spatel added a subscriber: dberlin.
spatel added a comment.

In https://reviews.llvm.org/D33338#760470, @davide wrote:

> In https://reviews.llvm.org/D33338#760426, @spatel wrote:
>
> > This is strictly safer (does the transform less often), so LGTM. Bonus: we're not creating an unnecessary cast in the case where the result size is the same as the source op.
> >
> > @davide, let me know if you plan to work on the (lshr (sext X), C) patterns. I checked in the tests with https://reviews.llvm.org/rL303504.
>
>
> I do plan to work on them, but if you have some bandwidth, any help would be appreciated :)
>  Should we open a bug so that we don't forget?

Bugs reports are always good for me, but since this one is in mental cache right now, I'd rather just squash it now. :)

The interesting thing about this case is that it feeds into the larger current discussion about how to split up InstCombine to make the whole thing less obnoxious.

Take these two cases:

  Name: shift_less
  %sext = sext i3 %x to i8
  %hibit = lshr i8 %sext, 7
    =>
  %tr = lshr i3 %x, 2
  %hibit = zext i3 %tr to i8

  Name: bool_hibit
  %sext = sext i1 %x to i8
  %hibit = lshr i8 %sext, 7
    =>
  %hibit = zext i1 %x to i8

If we think of InstCombine as a pure canonicalizer, we might make a lshr rule like (semi-pseudo-code):

  // Are we moving the sign bit to the low bit and widening a value with high zeros?
  // lshr (sext X), C --> zext (lshr X, C2)
  if (match(Op0, m_SExt(m_Value(X))) && 
      match(Op1, m_APInt(C)) && 
      C->getZextValue() == Op0->getScalarSizeInBits() - 1) {
     return createZext(createLShr(X, X->getScalarSizeInBits() - 1));
  }

In other words, we would not special-case the i1 (bool) example and eliminate this no-op before it ever existed:
%tr = lshr i3 %x, 0
...we'd leave that to the optimizer pass because eliminating that instruction is an optimization, not a canonicalization.

Now, we might just say this is a bug in the IRBuilder - why doesn't it check for a zero shift constant before creating an obviously unnecessary instruction? On the other hand, that's not its job. It shouldn't think at all about that special case; that's the caller's responsibility.

We'll find many examples like this if we start trying to create a pure canonicalizer. If I interpreted the statement correctly, @dberlin mentioned this potential inefficiency in:
http://lists.llvm.org/pipermail/llvm-dev/2017-May/113220.html

Repository:
  rL LLVM

https://reviews.llvm.org/D33338