[llvm] [InstCombine] Do not simplify lshr/shl arg if it is part of fshl rotate pattern. (PR #66115)

Fri Sep 22 07:17:28 PDT 2023

================
@@ -669,6 +686,22 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Value *V, APInt DemandedMask,
     if (match(I->getOperand(1), m_APInt(SA))) {
       uint64_t ShiftAmt = SA->getLimitedValue(BitWidth-1);
 
+      // Do not simplify if shl is part of fshl rotate pattern
+      if (I->hasOneUser()) {
+        auto *Op = I->user_back();
+        if (Op->getOpcode() == BinaryOperator::Or) {
+          const APInt *ShlAmt;
+          Value *ShlVal;
+          auto *Operand =
+              Op->getOperand(0) == I ? Op->getOperand(1) : Op->getOperand(0);
+          if (match(Operand, m_OneUse(m_Shl(m_Value(ShlVal), m_APInt(ShlAmt)))))
+            if (I->getOperand(0) == ShlVal)
+              if (SA->ult(BitWidth) && ShlAmt->ult(BitWidth) &&
+                  (*SA + *ShlAmt) == BitWidth)
+                return nullptr;
+        }
+      }
----------------
goldsteinn wrote:

Somewhat opposed to this, because it's easy for the logic that detects "should we be able to do X" to de-sync from the logic that does X.

I would prefer adding a helper to InstructionCombiner that can be used both for the rotate creation folds and here, to keep the two in sync.

Something along the following lines:
```
std::optional<std::tuple<Intrinsic::ID, Value *, Value *>>
ConvertShlOrLShrToFShlOrFShr(BinaryOperator *I) {
  Value *X, *ShAmt1, *ShAmt2;
  unsigned BitWidth = I->getType()->getScalarSizeInBits();
  // Match `or (shl X, ShAmt1), (lshr X, ShAmt2)` in either operand order.
  if (!match(I, m_c_Or(m_OneUse(m_Shl(m_Value(X), m_Value(ShAmt1))),
                       m_OneUse(m_LShr(m_Deferred(X), m_Value(ShAmt2))))))
    return std::nullopt;

  // Constant amounts: both must be in range and sum to the bit width.
  const APInt *C1;
  const APInt *C2;
  if (match(ShAmt1, m_APInt(C1)) && match(ShAmt2, m_APInt(C2))) {
    if (C1->uge(BitWidth) || C2->uge(BitWidth) || (*C1 + *C2) != BitWidth)
      return std::nullopt;
    return std::make_tuple(Intrinsic::fshl, X, ShAmt1);
  }

  // Variable amounts: Amt1 must be `(-Amt2) & (BitWidth - 1)`.
  auto ShAmtsMatchForRotate = [BitWidth](Value *Amt1, Value *Amt2) {
    return match(Amt1,
                 m_And(m_Neg(m_Specific(Amt2)), m_SpecificInt(BitWidth - 1)));
  };

  if (ShAmtsMatchForRotate(ShAmt1, ShAmt2))
    return std::make_tuple(Intrinsic::fshl, X, ShAmt1);
  if (ShAmtsMatchForRotate(ShAmt2, ShAmt1))
    return std::make_tuple(Intrinsic::fshr, X, ShAmt2);
  return std::nullopt;
}
```
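
For reference, the two shift-amount conditions in the helper can be sanity-checked outside of LLVM. The following is only a minimal sketch on plain 32-bit integers; `rotlRef`, `isRotateConstPair`, `shlOrLshr` and `shlOrLshrMasked` are hypothetical names made up for illustration and are not part of InstCombine:

```cpp
#include <cassert>
#include <cstdint>

// Reference rotate-left done one bit at a time, so it shares no logic with
// the shift/or forms checked below.
uint32_t rotlRef(uint32_t X, unsigned N) {
  for (unsigned I = 0; I < N % 32; ++I)
    X = (X << 1) | (X >> 31);
  return X;
}

// Constant form: both amounts in range and C1 + C2 == bit width (32 here).
// This mirrors the uge/sum check on the APInt amounts.
bool isRotateConstPair(unsigned C1, unsigned C2) {
  return C1 < 32 && C2 < 32 && C1 + C2 == 32;
}

// `or (shl X, C1), (lshr X, C2)`; only call with amounts below 32.
uint32_t shlOrLshr(uint32_t X, unsigned C1, unsigned C2) {
  return (X << C1) | (X >> C2);
}

// Variable form: the lshr amount is `(-S) & 31`; S is assumed to be in [0, 31].
uint32_t shlOrLshrMasked(uint32_t X, unsigned S) {
  return (X << S) | (X >> ((0u - S) & 31u));
}
```

When `isRotateConstPair(C1, C2)` holds, `shlOrLshr(X, C1, C2)` should agree with `rotlRef(X, C1)`, and `shlOrLshrMasked(X, S)` should agree with `rotlRef(X, S)` for any in-range `S`, which is why either amount pattern can be reported as a funnel shift with both value operands equal to `X`.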

If here you do:
```
if (I->hasOneUse())
  if (auto *BO = dyn_cast<BinaryOperator>(I->user_back()))
    if (ConvertShlOrLShrToFShlOrFShr(BO).has_value())
      return nullptr;
```

And if the rotate fold switches to using `ConvertShlOrLShrToFShlOrFShr` to actually create rotates, then both will stay aligned as pattern detection improves.

https://github.com/llvm/llvm-project/pull/66115