[llvm] r277635 - [InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding

Tobias Grosser via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 3 12:30:35 PDT 2016


Author: grosser
Date: Wed Aug  3 14:30:35 2016
New Revision: 277635

URL: http://llvm.org/viewvc/llvm-project?rev=277635&view=rev
Log:
[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding

Summary:
InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` such that in a later iteration of InstCombine the exposed `zext(icmp)` instructions can be optimized. We now combine this unfolding and the subsequent `zext(icmp)` optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of `logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))` which had been disabled due to its interference with the unfolding transformation.

Tested via `make check` and `lnt`.

Background
==========

For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in `visitZExt()` in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` in order to expose `zext(icmp)` operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these `zext(icmp)` would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the `zext`-unfolding a
 nd the elimination of the exposed `zext(icmp)` to happen at one go because this enables us to still allow the cast-folding in `logic(cast(icmp), cast(icmp))` without entering an endless loop again.

Details on the submitted changes
================================

- In `visitZExt()` we combine the unfolding and optimization of `zext` instructions.
- In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted in a `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would otherwise cause according assertions to be triggered (in our case this happend, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of `replaceInstUsesWith()` is necessary to ensure that the new `CastInst` replaces the `ZExtInst` accordingly.
- In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions.
- The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes.
- The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105.

Reviewers: grosser, majnemer, spatel

Subscribers: eli.friedman, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22864

Contributed-by: Matthias Reisinger <d412vv1n at gmail.com>

Modified:
    llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
    llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp
    llvm/trunk/test/Transforms/InstCombine/zext-or-icmp.ll
    llvm/trunk/test/Transforms/InstCombine/zext.ll

Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp?rev=277635&r1=277634&r2=277635&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (original)
+++ llvm/trunk/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp Wed Aug  3 14:30:35 2016
@@ -1209,8 +1209,7 @@ Instruction *InstCombiner::foldCastedBit
   Value *Cast1Src = Cast1->getOperand(0);
 
   // fold logic(cast(A), cast(B)) -> cast(logic(A, B))
-  if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) &&
-      shouldOptimizeCast(Cast0) && shouldOptimizeCast(Cast1)) {
+  if (shouldOptimizeCast(Cast0) && shouldOptimizeCast(Cast1)) {
     Value *NewOp = Builder->CreateBinOp(LogicOpc, Cast0Src, Cast1Src,
                                         I.getName());
     return CastInst::Create(CastOpcode, NewOp, DestTy);

Modified: llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp?rev=277635&r1=277634&r2=277635&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp (original)
+++ llvm/trunk/lib/Transforms/InstCombine/InstCombineCasts.cpp Wed Aug  3 14:30:35 2016
@@ -623,7 +623,9 @@ Instruction *InstCombiner::transformZExt
 
         if (CI.getType() == In->getType())
           return replaceInstUsesWith(CI, In);
-        return CastInst::CreateIntegerCast(In, CI.getType(), false/*ZExt*/);
+
+        Value *IntCast = Builder->CreateIntCast(In, CI.getType(), false);
+        return replaceInstUsesWith(CI, IntCast);
       }
     }
   }
@@ -890,16 +892,26 @@ Instruction *InstCombiner::visitZExt(ZEx
 
   BinaryOperator *SrcI = dyn_cast<BinaryOperator>(Src);
   if (SrcI && SrcI->getOpcode() == Instruction::Or) {
-    // zext (or icmp, icmp) --> or (zext icmp), (zext icmp) if at least one
-    // of the (zext icmp) will be transformed.
+    // zext (or icmp, icmp) -> or (zext icmp), (zext icmp) if at least one
+    // of the (zext icmp) can be eliminated. If so, immediately perform the
+    // according elimination.
     ICmpInst *LHS = dyn_cast<ICmpInst>(SrcI->getOperand(0));
     ICmpInst *RHS = dyn_cast<ICmpInst>(SrcI->getOperand(1));
     if (LHS && RHS && LHS->hasOneUse() && RHS->hasOneUse() &&
         (transformZExtICmp(LHS, CI, false) ||
          transformZExtICmp(RHS, CI, false))) {
+      // zext (or icmp, icmp) -> or (zext icmp), (zext icmp)
       Value *LCast = Builder->CreateZExt(LHS, CI.getType(), LHS->getName());
       Value *RCast = Builder->CreateZExt(RHS, CI.getType(), RHS->getName());
-      return BinaryOperator::Create(Instruction::Or, LCast, RCast);
+      BinaryOperator *Or = BinaryOperator::Create(Instruction::Or, LCast, RCast);
+
+      // Perform the elimination.
+      if (auto *LZExt = dyn_cast<ZExtInst>(LCast))
+        transformZExtICmp(LHS, *LZExt);
+      if (auto *RZExt = dyn_cast<ZExtInst>(RCast))
+        transformZExtICmp(RHS, *RZExt);
+
+      return Or;
     }
   }
 

Modified: llvm/trunk/test/Transforms/InstCombine/zext-or-icmp.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/zext-or-icmp.ll?rev=277635&r1=277634&r2=277635&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/InstCombine/zext-or-icmp.ll (original)
+++ llvm/trunk/test/Transforms/InstCombine/zext-or-icmp.ll Wed Aug  3 14:30:35 2016
@@ -13,8 +13,8 @@ define i8 @zext_or_icmp_icmp(i8 %a, i8 %
 ; CHECK-LABEL: zext_or_icmp_icmp(
 ; CHECK-NEXT:    %mask = and i8 %a, 1
 ; CHECK-NEXT:    %toBool2 = icmp eq i8 %b, 0
-; CHECK-NEXT:    %1 = xor i8 %mask, 1
 ; CHECK-NEXT:    %toBool22 = zext i1 %toBool2 to i8
+; CHECK-NEXT:    %1 = xor i8 %mask, 1
 ; CHECK-NEXT:    %zext = or i8 %1, %toBool22
 ; CHECK-NEXT:    ret i8 %zext
 }

Modified: llvm/trunk/test/Transforms/InstCombine/zext.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/InstCombine/zext.ll?rev=277635&r1=277634&r2=277635&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/InstCombine/zext.ll (original)
+++ llvm/trunk/test/Transforms/InstCombine/zext.ll Wed Aug  3 14:30:35 2016
@@ -70,3 +70,72 @@ define <2 x i64> @fold_xor_zext_sandwich
   ret <2 x i64> %zext2
 }
 
+; Assert that zexts in and(zext(icmp), zext(icmp)) can be folded.
+; CHECK-LABEL: @fold_and_zext_icmp(
+; CHECK-NEXT:    [[ICMP1:%.*]] = icmp sgt i64 %a, %b
+; CHECK-NEXT:    [[ICMP2:%.*]] = icmp slt i64 %a, %c
+; CHECK-NEXT:    [[AND:%.*]] = and i1 [[ICMP1]], [[ICMP2]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i1 [[AND]] to i8
+; CHECK-NEXT:    ret i8 [[ZEXT]]
+define i8 @fold_and_zext_icmp(i64 %a, i64 %b, i64 %c) {
+  %1 = icmp sgt i64 %a, %b
+  %2 = zext i1 %1 to i8
+  %3 = icmp slt i64 %a, %c
+  %4 = zext i1 %3 to i8
+  %5 = and i8 %2, %4
+  ret i8 %5
+}
+
+; Assert that zexts in or(zext(icmp), zext(icmp)) can be folded.
+; CHECK-LABEL: @fold_or_zext_icmp(
+; CHECK-NEXT:    [[ICMP1:%.*]] = icmp sgt i64 %a, %b
+; CHECK-NEXT:    [[ICMP2:%.*]] = icmp slt i64 %a, %c
+; CHECK-NEXT:    [[OR:%.*]] = or i1 [[ICMP1]], [[ICMP2]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i1 [[OR]] to i8
+; CHECK-NEXT:    ret i8 [[ZEXT]]
+define i8 @fold_or_zext_icmp(i64 %a, i64 %b, i64 %c) {
+  %1 = icmp sgt i64 %a, %b
+  %2 = zext i1 %1 to i8
+  %3 = icmp slt i64 %a, %c
+  %4 = zext i1 %3 to i8
+  %5 = or i8 %2, %4
+  ret i8 %5
+}
+
+; Assert that zexts in xor(zext(icmp), zext(icmp)) can be folded.
+; CHECK-LABEL: @fold_xor_zext_icmp(
+; CHECK-NEXT:    [[ICMP1:%.*]] = icmp sgt i64 %a, %b
+; CHECK-NEXT:    [[ICMP2:%.*]] = icmp slt i64 %a, %c
+; CHECK-NEXT:    [[XOR:%.*]] = xor i1 [[ICMP1]], [[ICMP2]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i1 [[XOR]] to i8
+; CHECK-NEXT:    ret i8 [[ZEXT]]
+define i8 @fold_xor_zext_icmp(i64 %a, i64 %b, i64 %c) {
+  %1 = icmp sgt i64 %a, %b
+  %2 = zext i1 %1 to i8
+  %3 = icmp slt i64 %a, %c
+  %4 = zext i1 %3 to i8
+  %5 = xor i8 %2, %4
+  ret i8 %5
+}
+
+; Assert that zexts in logic(zext(icmp), zext(icmp)) are also folded accross
+; nested logical operators.
+; CHECK-LABEL: @fold_nested_logic_zext_icmp(
+; CHECK-NEXT:    [[ICMP1:%.*]] = icmp sgt i64 %a, %b
+; CHECK-NEXT:    [[ICMP2:%.*]] = icmp slt i64 %a, %c
+; CHECK-NEXT:    [[AND:%.*]] = and i1 [[ICMP1]], [[ICMP2]]
+; CHECK-NEXT:    [[ICMP3:%.*]] = icmp eq i64 %a, %d
+; CHECK-NEXT:    [[OR:%.*]] = or i1 [[AND]], [[ICMP3]]
+; CHECK-NEXT:    [[ZEXT:%.*]] = zext i1 [[OR]] to i8
+; CHECK-NEXT:    ret i8 [[ZEXT]]
+define i8 @fold_nested_logic_zext_icmp(i64 %a, i64 %b, i64 %c, i64 %d) {
+  %1 = icmp sgt i64 %a, %b
+  %2 = zext i1 %1 to i8
+  %3 = icmp slt i64 %a, %c
+  %4 = zext i1 %3 to i8
+  %5 = and i8 %2, %4
+  %6 = icmp eq i64 %a, %d
+  %7 = zext i1 %6 to i8
+  %8 = or i8 %5, %7
+  ret i8 %8
+}




More information about the llvm-commits mailing list