[PATCH] D39421: [InstCombine] Extracting common and-mask for shift operands of Or instruction

Sanjay Patel via Phabricator via llvm-commits <llvm-commits at lists.llvm.org>
Tue Nov 28 10:30:21 PST 2017


spatel added a comment.

In https://reviews.llvm.org/D39421#937773, @opaparo wrote:

> In https://reviews.llvm.org/D39421#937705, @spatel wrote:
>
> > I think inverting the canonicalization of shl+and would make your first test case optimize without this patch
>
>
> Could you please explain why? I'm not sure I'm seeing it.


If we invert the shl+and transform, we don't need this patch to reach optimal code for 3 out of the 6 tests:

  define i32 @or_and_shifts1(i32 %x) {
    %1 = and i32 %x, 1
    %2 = shl nuw nsw i32 %1, 3
    %3 = and i32 %x, 1   ; <-- CSE will eliminate this
    %4 = shl nuw nsw i32 %3, 5
    %5 = or i32 %2, %4
    ret i32 %5
  }
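
After CSE removes the duplicated mask, that leaves roughly the following (a hand-written sketch with throwaway value numbering; the actual output may differ slightly):

  define i32 @or_and_shifts1(i32 %x) {
    %1 = and i32 %x, 1          ; single mask shared by both shifts
    %2 = shl nuw nsw i32 %1, 3
    %3 = shl nuw nsw i32 %1, 5
    %4 = or i32 %2, %3
    ret i32 %4
  }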

Similarly (what does this test check that is different from the above?):

  define i32 @or_and_shift_shift_and(i32 %x) {
    %1 = and i32 %x, 7
    %2 = shl nuw nsw i32 %1, 3
    %3 = and i32 %x, 7   ; <-- CSE will eliminate this
    %4 = shl nuw nsw i32 %3, 2
    %5 = or i32 %2, %4
    ret i32 %5
  }

And again:

  define i32 @multiuse2(i32 %x) {
    %1 = and i32 %x, 126
    %2 = shl nuw nsw i32 %1, 8
    %3 = and i32 %x, 126 ; <-- CSE will eliminate this
    %4 = shl nuw nsw i32 %3, 1
    %5 = or i32 %2, %4
    ret i32 %5
  }



>> Currently, we end up inverting the canonicalization in the x86 backend (because a smaller constant mask can be created in fewer instruction bytes), so it would be better to "get it right" here in IR in the first place.
>>  I understand the multi-use case better now with your explanation, so I agree that we want this patch to handle those cases too. But I don't think we should ignore the underlying canonicalization choices just because we know we want to catch the larger patterns.
> 
> I agree that this alternative canonicalization could prove to be beneficial and more correct. However, I feel that this discussion is orthogonal to this patch, and if it is indeed decided to switch to the new form, then some of the code in this patch, along with several other pieces of code, may need to change accordingly.

Since we can eliminate the need for this patch in half of the tests (note: I checked in the tests at https://reviews.llvm.org/rL319182, so we can see what they look like currently), I don't think the underlying transform is orthogonal. If this patch would change with the inverted canonicalization, then that's more reason to view the inversion as a preliminary step for this patch. Otherwise, we're adding code unnecessarily. It's possible that inverting shl+and inhibits other folds, and if that's the case, then why not fix that too?

Here's the draft patch I used to check the tests above:

  Index: lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
  ===================================================================
  --- lib/Transforms/InstCombine/InstCombineAndOrXor.cpp	(revision 319170)
  +++ lib/Transforms/InstCombine/InstCombineAndOrXor.cpp	(working copy)
  @@ -1212,6 +1212,13 @@
         return BinaryOperator::CreateOr(And, ConstantInt::get(I.getType(),
                                                               Together));
       }
  +    const APInt *ShlC;
  +    if (match(Op0, m_OneUse(m_Shl(m_Value(X), m_APInt(ShlC))))) {
  +      Constant *NewMask = ConstantInt::get(I.getType(), C->lshr(*ShlC));
  +      Value *NewAnd = Builder.CreateAnd(X, NewMask);
  +      return BinaryOperator::CreateShl(NewAnd, ConstantInt::get(I.getType(),
  +                                                                *ShlC));
  +    }
   
       // If the mask is only needed on one incoming arm, push the 'and' op up.
       if (match(Op0, m_OneUse(m_Xor(m_Value(X), m_Value(Y)))) ||
  Index: lib/Transforms/InstCombine/InstCombineShifts.cpp
  ===================================================================
  --- lib/Transforms/InstCombine/InstCombineShifts.cpp	(revision 319170)
  +++ lib/Transforms/InstCombine/InstCombineShifts.cpp	(working copy)
  @@ -505,7 +505,7 @@
         // If the operand is a bitwise operator with a constant RHS, and the
         // shift is the only use, we can pull it out of the shift.
         const APInt *Op0C;
  -      if (match(Op0BO->getOperand(1), m_APInt(Op0C))) {
  +      if (match(Op0BO->getOperand(1), m_APInt(Op0C)) && !isLeftShift) {
           if (canShiftBinOpWithConstantRHS(I, Op0BO, *Op0C)) {
             Constant *NewRHS = ConstantExpr::get(I.getOpcode(),
                                        cast<Constant>(Op0BO->getOperand(1)), Op1);

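In words, the first hunk rewrites (X << ShlC) & C into (X & (C >> ShlC)) << ShlC when the shift has a single use, and the InstCombineShifts.cpp change keeps the existing shift fold from hoisting the mask constant back above a left shift, which would otherwise undo the new form. A minimal hand-written illustration of the effect (throwaway value names, not one of the checked-in tests; wrap flags omitted):

  ; before: the (shifted, larger) mask applies to the shifted value
  %s = shl i32 %x, 3
  %a = and i32 %s, 8

  ; after: the smaller mask applies to %x, below the shift
  %m = and i32 %x, 1
  %r = shl i32 %m, 3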

https://reviews.llvm.org/D39421




