[llvm] [GlobalISel] Handle div-by-pow2 (PR #83155)

Shilei Tian via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 22 11:21:48 PDT 2024


================
@@ -996,6 +996,105 @@ llvm::ConstantFoldCTLZ(Register Src, const MachineRegisterInfo &MRI) {
   return std::nullopt;
 }
 
+std::optional<SmallVector<unsigned>>
+llvm::ConstantFoldCTTZ(Register Src, const MachineRegisterInfo &MRI) {
+  LLT Ty = MRI.getType(Src);
+  SmallVector<unsigned> FoldedCTTZs;
+  auto tryFoldScalar = [&](Register R) -> std::optional<unsigned> {
+    auto MaybeCst = getIConstantVRegVal(R, MRI);
+    if (!MaybeCst)
+      return std::nullopt;
+    return MaybeCst->countTrailingZeros();
+  };
+  if (Ty.isVector()) {
+    // Try to constant fold each element.
+    auto *BV = getOpcodeDef<GBuildVector>(Src, MRI);
+    if (!BV)
+      return std::nullopt;
+    for (unsigned SrcIdx = 0; SrcIdx < BV->getNumSources(); ++SrcIdx) {
+      if (auto MaybeFold = tryFoldScalar(BV->getSourceReg(SrcIdx))) {
+        FoldedCTTZs.emplace_back(*MaybeFold);
+        continue;
+      }
+      return std::nullopt;
+    }
+    return FoldedCTTZs;
+  }
+  if (auto MaybeCst = tryFoldScalar(Src)) {
+    FoldedCTTZs.emplace_back(*MaybeCst);
+    return FoldedCTTZs;
+  }
+  return std::nullopt;
+}
+
+std::optional<SmallVector<APInt>>
+llvm::ConstantFoldICmp(unsigned Pred, const Register Op1, const Register Op2,
+                       const MachineRegisterInfo &MRI) {
+  LLT Ty = MRI.getType(Op1);
+  if (Ty != MRI.getType(Op2))
+    return std::nullopt;
+
+  auto TryFoldScalar = [&MRI, Pred](Register LHS,
+                                    Register RHS) -> std::optional<APInt> {
+    auto LHSCst = getIConstantVRegVal(LHS, MRI);
----------------
shiltian wrote:

@jayfoad @arsenm @aemerson
This may crash in `getIConstantVRegVal` if `buildInstr` is called while another MI is in the middle of being rewritten, as in the following example:
```
bb.0:
  %0:_(p1) = COPY $vgpr0_vgpr1
  %1:_(s32) = COPY $vgpr2
  %2:_(s32) = COPY $vgpr3
  %3:_(s32) = G_ATOMIC_CMPXCHG %0:_(p1), %1:_, %2:_ :: (load store syncscope("agent-one-as") monotonic monotonic (s32), addrspace 1)
  %3:_(s32), %4:_(s1) = G_ATOMIC_CMPXCHG_WITH_SUCCESS %0:_(p1), %1:_, %2:_ :: (load store syncscope("agent-one-as") monotonic monotonic (s32), addrspace 1)
  S_ENDPGM 0, implicit %3:_(s32), implicit %4:_(s1)
```
As we can see, `%3` is defined twice because the second instruction is being lowered and will be erased from its parent. However, `%3` is `Op1` when building the `ICmp` instruction, which causes `getIConstantVRegVal` to crash in `MachineRegisterInfo::getVRegDef`:
```
assert((I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition");
```
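To make the failure mode concrete, here is a minimal standalone sketch of the single-definition assumption. It is a toy model, not LLVM code: `ToyInstr`, `ToyRegInfo`, `numDefs`, and `tryGetConstant` are all hypothetical names introduced for illustration. The guard in `tryGetConstant` is only one possible way to avoid the assertion (skip folding when a register temporarily has more than one def); in real code a comparable check might use `MachineRegisterInfo::hasOneDef`, but whether that is the right fix is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Toy stand-ins, NOT the LLVM classes: a register is an unsigned id and an
// "instruction" only records whether it defines a constant value.
struct ToyInstr {
  std::optional<long> Constant; // set when the instruction is a constant def
};

struct ToyRegInfo {
  // Every instruction currently defining each register.
  std::unordered_map<unsigned, std::vector<const ToyInstr *>> Defs;

  void addDef(unsigned Reg, const ToyInstr &MI) { Defs[Reg].push_back(&MI); }

  std::size_t numDefs(unsigned Reg) const {
    auto It = Defs.find(Reg);
    return It == Defs.end() ? 0 : It->second.size();
  }

  // Mirrors the assumption in the assertion quoted above: at most one def.
  const ToyInstr *getVRegDef(unsigned Reg) const {
    assert(numDefs(Reg) <= 1 &&
           "assumes a single definition or no definition");
    auto It = Defs.find(Reg);
    if (It == Defs.end() || It->second.empty())
      return nullptr;
    return It->second.front();
  }
};

// Hypothetical guarded lookup: give up on folding unless the register has
// exactly one def, so getVRegDef is never reached in the double-def state.
std::optional<long> tryGetConstant(const ToyRegInfo &RI, unsigned Reg) {
  if (RI.numDefs(Reg) != 1)
    return std::nullopt;
  return RI.getVRegDef(Reg)->Constant;
}
```

With two defs registered for the same register (the `%3` situation above), `tryGetConstant` returns `std::nullopt` instead of tripping the assertion, whereas calling `getVRegDef` directly would assert.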
What is the best practice to handle this here?

https://github.com/llvm/llvm-project/pull/83155

