[llvm] [VPlan] Speed up VPSlotTracker by using ModuleSlotTracker (PR #139881)
Igor Kirillov via llvm-commits
llvm-commits at lists.llvm.org
Mon May 19 02:53:05 PDT 2025
================
@@ -1441,7 +1441,23 @@ void VPSlotTracker::assignName(const VPValue *V) {
std::string Name;
if (UV) {
raw_string_ostream S(Name);
- UV->printAsOperand(S, false);
+ if (MST) {
+ UV->printAsOperand(S, false, *MST);
+ } else if (isa<Instruction>(UV) && !UV->hasName()) {
+ // Lazily create the ModuleSlotTracker when we first hit an unnamed
+ // instruction
+ auto *IUV = cast<Instruction>(UV);
+ // This check is required to support unit tests with incomplete IR.
+ if (IUV->getParent()) {
+ MST = std::make_unique<ModuleSlotTracker>(IUV->getModule());
----------------
igogo-x86 wrote:
The main slowdown comes from `VPRecipeBase::cost`, which prints debug output like:
```
dbgs() << "Cost of " << RecipeCost << " for VF " << VF << ": ";
dump();
```
which produces lines such as:
```
Cost of 1 for VF 2: WIDEN ir<%1547> = add nsw ir<%1546>, ir<%10>
```
This is where we create and destroy a new `VPSlotTracker` for each line of output. `VPSlotTracker` assigns names to all VPValues in the VPlan every time it’s instantiated. That does not look okay, but I guess we can live with that.
The real issue is that for each underlying Value, `VPSlotTracker` internally creates a `SlotTracker`, which in turn traverses all instructions in the function. That makes dumping a single value quadratic in the size of the function. The time `VPSlotTracker` spends on its own name assignment is minor compared to the much larger cost of repeatedly initializing `SlotTracker`.
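To make the complexity argument concrete, here is a rough toy model, not LLVM code. `ToyInst`, `ToySlotTracker`, and the two helper functions are hypothetical names made up for this sketch. The only assumption is that constructing a tracker numbers every instruction in the function. It counts instruction visits when a fresh tracker is built for each printed value and when one lazily created tracker is shared, which is roughly what the patch does with `MST`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for an unnamed IR instruction; not an LLVM type.
struct ToyInst {
  int Id = 0;
};

// Toy stand-in for a slot tracker. Numbering every instruction of the
// function up front makes construction linear in the function size.
class ToySlotTracker {
  std::unordered_map<const ToyInst *, unsigned> Slots;

public:
  ToySlotTracker(const std::vector<ToyInst> &Func, std::size_t &Visited) {
    unsigned Next = 0;
    for (const ToyInst &I : Func) {
      Slots[&I] = Next++;
      ++Visited; // count every instruction the tracker has to walk
    }
  }

  std::string nameOf(const ToyInst &I) const {
    auto It = Slots.find(&I);
    assert(It != Slots.end() && "instruction does not belong to function");
    return "%" + std::to_string(It->second);
  }
};

// A fresh tracker per printed value: N values times N instructions.
std::size_t visitsWithFreshTrackers(const std::vector<ToyInst> &Func) {
  std::size_t Visited = 0;
  for (const ToyInst &I : Func) {
    ToySlotTracker Tracker(Func, Visited);
    (void)Tracker.nameOf(I);
  }
  return Visited;
}

// One tracker, created lazily on first use and then reused: N instructions.
std::size_t visitsWithSharedTracker(const std::vector<ToyInst> &Func) {
  std::size_t Visited = 0;
  std::unique_ptr<ToySlotTracker> Shared;
  for (const ToyInst &I : Func) {
    if (!Shared)
      Shared = std::make_unique<ToySlotTracker>(Func, Visited);
    (void)Shared->nameOf(I);
  }
  return Visited;
}
```

For a function with N instructions, the first helper visits N * N instructions and the second visits N.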
https://github.com/llvm/llvm-project/pull/139881