[Mlir-commits] [mlir] [mlir][ControlFlow] Improve time complexity of RegionBranchOpInterface canonicalization patterns (PR #186114)
Yang Bai
llvmlistbot at llvm.org
Fri Mar 13 05:31:18 PDT 2026
================
@@ -1006,39 +993,63 @@ struct RemoveDuplicateSuccessorInputUses : public RewritePattern {
return getArgOrResultNumber(a) < getArgOrResultNumber(b);
});
- // Check every distinct pair of successor inputs for duplicates. Replace
- // `input2` with `input1` if they are duplicates.
+ // Group inputs by their operand "signature" to find duplicates. Two
+ // successor inputs are duplicates if each predecessor (region branch point)
+ // forwards the same value for both. Let n = number of successor inputs and
+ // k = number of predecessors per input. Instead of comparing every pair of
+ // inputs (O(n² * k)), we build a signature for each input and group them
+ // via a std::map.
+ //
+ // A signature is a sorted list of (predecessor, forwarded value) pairs.
+ // Within each group, all but the first (canonical) input are replaced with
+ // the canonical one.
+ using SigEntry = std::pair<Operation *, Value>;
+ using Signature = SmallVector<SigEntry>;
+ auto sigEntryLess = [](const SigEntry &a, const SigEntry &b) {
+ if (a.first != b.first)
+ return a.first < b.first;
+ return a.second.getAsOpaquePointer() < b.second.getAsOpaquePointer();
+ };
+ // The map key is (signature, owner). Two inputs are duplicates only if they
+ // have the same signature AND the same owner (block or defining op). This
+ // ensures we track one canonical per owner group.
+ using MapKey = std::pair<Signature, void *>;
+ auto mapKeyLess = [&](const MapKey &a, const MapKey &b) {
+ if (a.second != b.second)
+ return a.second < b.second;
+ return std::lexicographical_compare(a.first.begin(), a.first.end(),
+ b.first.begin(), b.first.end(),
+ sigEntryLess);
+ };
+ std::map<MapKey, Value, decltype(mapKeyLess)> signatureToCanonical(
----------------
yangtetris wrote:
I chose `std::map` for two reasons:
1. Using `DenseMap` or `unordered_map` requires implementing a hash function for `SmallVector<std::pair<Operation *, Value>>`, which is slightly more complex (~10 extra lines) than providing a comparator.
2. In this scenario, _n_ is typically small, so the O(k log n) lookup cost of `std::map` (O(log n) key comparisons, each O(k)) is comparable to the O(k) cost of hashing a signature; the constant factor of computing a hash can outweigh the log n factor. I benchmarked with n=60 and `std::map` was slightly faster than a hash map.
That said, both reasons are minor and I'm open to changing this. If you'd prefer to stay consistent with LLVM ADT, I can switch to `SmallDenseMap`. WDYT?
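
For readers following the trade-off, here is a minimal standalone sketch of the two options. It is not the code in the PR. It assumes plain `int` ids as stand-ins for `Operation *`, `Value`, and the owner pointer, and the names `canonicalsWithMap`, `canonicalsWithHash`, and `MapKeyHash` are hypothetical. The hash combiner is an arbitrary illustrative choice, not an LLVM utility.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in types (assumption): ints replace Operation *, Value, and owner.
using SigEntry = std::pair<int, int>;      // (predecessor id, forwarded value id)
using Signature = std::vector<SigEntry>;   // sorted list of entries
using MapKey = std::pair<Signature, int>;  // (signature, owner id)

// Option 1: ordered map with a comparator, as in the patch under review.
// Returns, for each input, the index of the first input with the same key.
std::vector<std::size_t> canonicalsWithMap(std::vector<Signature> signatures,
                                           const std::vector<int> &owners) {
  assert(signatures.size() == owners.size());
  auto mapKeyLess = [](const MapKey &a, const MapKey &b) {
    if (a.second != b.second)
      return a.second < b.second;
    return std::lexicographical_compare(a.first.begin(), a.first.end(),
                                        b.first.begin(), b.first.end());
  };
  std::map<MapKey, std::size_t, decltype(mapKeyLess)> signatureToCanonical(
      mapKeyLess);
  std::vector<std::size_t> canonical(signatures.size());
  for (std::size_t i = 0; i < signatures.size(); ++i) {
    // Sort so that the order in which predecessors were visited is irrelevant.
    std::sort(signatures[i].begin(), signatures[i].end());
    auto it = signatureToCanonical
                  .try_emplace(MapKey(std::move(signatures[i]), owners[i]), i)
                  .first;
    canonical[i] = it->second;
  }
  return canonical;
}

// Option 2: hash map. The extra code is the hash functor for the key.
struct MapKeyHash {
  std::size_t operator()(const MapKey &key) const {
    std::size_t seed = std::hash<int>{}(key.second);
    auto combine = [&seed](int v) {
      seed ^= std::hash<int>{}(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    };
    for (const SigEntry &entry : key.first) {
      combine(entry.first);
      combine(entry.second);
    }
    return seed;
  }
};

std::vector<std::size_t> canonicalsWithHash(std::vector<Signature> signatures,
                                            const std::vector<int> &owners) {
  assert(signatures.size() == owners.size());
  std::unordered_map<MapKey, std::size_t, MapKeyHash> signatureToCanonical;
  std::vector<std::size_t> canonical(signatures.size());
  for (std::size_t i = 0; i < signatures.size(); ++i) {
    std::sort(signatures[i].begin(), signatures[i].end());
    auto it = signatureToCanonical
                  .try_emplace(MapKey(std::move(signatures[i]), owners[i]), i)
                  .first;
    canonical[i] = it->second;
  }
  return canonical;
}
```

Both functions do one map lookup per input instead of comparing every pair of inputs. An input whose result differs from its own index is a duplicate of the canonical input at that index. Two inputs with equal signatures but different owners stay in separate groups because the owner is part of the key.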
https://github.com/llvm/llvm-project/pull/186114