[PATCH] D67677: [InstCombine] dropRedundantMaskingOfLeftShiftInput(): pat. a/b with mask (PR42563)

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 19 10:39:24 PDT 2019


lebedev.ri marked 3 inline comments as done.
lebedev.ri added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp:176-180
+      Type *ExtendedScalarTy = Type::getIntNTy(Ty->getContext(), 2 * BitWidth);
+      Type *ExtendedTy =
+          Ty->isVectorTy()
+              ? VectorType::get(ExtendedScalarTy, Ty->getVectorNumElements())
+              : ExtendedScalarTy;
----------------
lebedev.ri wrote:
> lebedev.ri wrote:
> > spatel wrote:
> > > Is there a test showing that we need this ext+trunc complexity?
> > > 
> > > See if I've botched this Alive somehow, but the simpler constant mask appears to work:
> > > https://rise4fun.com/Alive/ArQC
> > Hmm. The reason i've gone forward with ext/trunc is: https://rise4fun.com/Alive/o5l
> > In your example alive does not complain because those are constants, and somehow the usual poison rules don't apply?
> > Are we sure this isn't alive limitation, but the correct behavior?
> I.e., i don't think i checked, what happens if `ConstantExpr::getShl()` is called with such out-of-bounds shift amount? Does it correctly handle it, or will ubsan complain, etc?
Tried. No, we can't do this, it breaks the whole point of losslessly handling lanes that need no extra masking.
```
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
index 3f466495c5e..8db01b4d4bd 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
@@ -171,21 +171,10 @@ dropRedundantMaskingOfLeftShiftInput(BinaryOperator *OuterShift,
       // But for a mask we need to get rid of old masking instruction.
       if (!Masked->hasOneUse())
         return nullptr; // Else we can't perform the fold.
-      // We should produce compute the mask in wider type, and truncate later!
-      // Get type twice as wide element-wise (same number of elements!).
-      Type *ExtendedScalarTy = Type::getIntNTy(Ty->getContext(), 2 * BitWidth);
-      Type *ExtendedTy =
-          Ty->isVectorTy()
-              ? VectorType::get(ExtendedScalarTy, Ty->getVectorNumElements())
-              : ExtendedScalarTy;
-      auto *ExtendedSumOfShAmts =
-          ConstantExpr::getZExt(SumOfShAmts, ExtendedTy);
       // And compute the mask as usual: ~(-1 << (SumOfShAmts))
-      auto *ExtendedAllOnes = ConstantExpr::getAllOnesValue(ExtendedTy);
-      auto *ExtendedInvertedMask =
-          ConstantExpr::getShl(ExtendedAllOnes, ExtendedSumOfShAmts);
-      auto *ExtendedMask = ConstantExpr::getNot(ExtendedInvertedMask);
-      NewMask = ConstantExpr::getTrunc(ExtendedMask, Ty);
+      auto *AllOnes = ConstantExpr::getAllOnesValue(Ty);
+      auto *InvertedMask = ConstantExpr::getShl(AllOnes, SumOfShAmts);
+      NewMask = ConstantExpr::getNot(InvertedMask);
     } else
       NewMask = nullptr; // No mask needed.
     // All good, we can do this fold.
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll
index 5445275ad1c..235e152d2fe 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll
@@ -82,7 +82,7 @@ define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T4]])
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]]
-; CHECK-NEXT:    [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 undef, i32 0, i32 1, i32 2147483647, i32 -1, i32 -1, i32 -1, i32 undef>
+; CHECK-NEXT:    [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 undef, i32 0, i32 1, i32 2147483647, i32 undef, i32 undef, i32 undef, i32 undef>
 ; CHECK-NEXT:    ret <8 x i32> [[T5]]
 ;
   %t0 = add <8 x i32> %nbits, <i32 -33, i32 -32, i32 -31, i32 -1, i32 0, i32 1, i32 31, i32 32>
diff --git a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll
index 6165b579661..0a7c0e5d030 100644
--- a/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll
+++ b/llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll
@@ -82,7 +82,7 @@ define <8 x i32> @t2_vec_nonsplat(<8 x i32> %x, <8 x i32> %nbits) {
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T2]])
 ; CHECK-NEXT:    call void @use8xi32(<8 x i32> [[T4]])
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl <8 x i32> [[X:%.*]], [[T4]]
-; CHECK-NEXT:    [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 undef, i32 0, i32 1, i32 2147483647, i32 -1, i32 -1, i32 -1, i32 undef>
+; CHECK-NEXT:    [[T5:%.*]] = and <8 x i32> [[TMP1]], <i32 undef, i32 0, i32 1, i32 2147483647, i32 undef, i32 undef, i32 undef, i32 undef>
 ; CHECK-NEXT:    ret <8 x i32> [[T5]]
 ;
   %t0 = add <8 x i32> %nbits, <i32 -33, i32 -32, i32 -31, i32 -1, i32 0, i32 1, i32 31, i32 32>

```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67677/new/

https://reviews.llvm.org/D67677





More information about the llvm-commits mailing list